2025-11-06

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
MDM: Manhattan Distance Mapping of DNN Weights for Parasitic-Resistance-Resilient Memristive Crossbars	Matheus Farias, Wanghley Martins, H. T. Kung	2025-11-06	Download	Manhattan Distance Mapping (MDM) is a post-training deep neural network (DNN) weight mapping technique for memristive bit-sliced compute-in-memory (CIM) crossbars that reduces parasitic resistance (PR...
SLOFetch: Compressed-Hierarchical Instruction Prefetching for Cloud Microservices	Zerui Bao, Di Zhu, Liu Jiang, Shiqi Sheng, Ziwei Wang, Haoyun Zhang	2025-11-06	Download	Large-scale networked services rely on deep software stacks and microservice orchestration, which increase instruction footprints and create frontend stalls that inflate tail latency and energy.
FuseFlow: A Fusion-Centric Compilation Framework for Sparse Deep Learning on Streaming Dataflow	Rubens Lacouture, Nathan Zhang, Ritvik Sharma, Marco Siracusa, Fredrik Kjolstad, Kunle Olukotun, Olivia Hsu	2025-11-06	Download	As deep learning models scale, sparse computation and specialized dataflow hardware have emerged as powerful solutions to address efficiency. We propose FuseFlow, a compiler that converts sparse machi...
Scalable and Efficient Intra- and Inter-node Interconnection Networks for Post-Exascale Supercomputers and Data centers	Joaquin Tarraga-Moreno, Daniel Barley, Francisco J. Andujar Munoz, Jesus Escudero-Sahuquillo, Holger Froning, Pedro Javier Garcia, Francisco J. Quiles, Jose Duato	2025-11-06	Download	The rapid growth of data-intensive applications such as generative AI, scientific simulations, and large-scale analytics is driving modern supercomputers and data centers toward increasingly heterogen...
wa-hls4ml: A Benchmark and Surrogate Models for hls4ml Resource and Latency Estimation	Benjamin Hawks, Jason Weitz, Dmitri Demler, Karla Tame-Narvaez, Dennis Plotnikov, Mohammad Mehdi Rahimifar, Hamza Ezzaoui Rahali, Audrey C. Therrien, Donovan Sproule, Elham E Khoda, Keegan A. Smith, Russell Marroquin, Giuseppe Di Guglielmo, Nhan Tran, Javier Duarte, Vladimir Loncar	2025-11-06	Download	As machine learning (ML) is increasingly implemented in hardware to address real-time challenges in scientific applications, the development of advanced toolchains has significantly reduced the time r...
Analysis of Single Event Induced Bit Faults in a Deep Neural Network Accelerator Pipeline	Naïn Jonckers, Toon Vinck, Peter Karsmakers, Jeffrey Prinzie	2025-11-06	Download	In recent years, the increased interest and the growth in application domains of Artificial Intelligence (AI), and more specifically Deep Neural Networks (DNNs), has led to an extensive usage of domai...
AIM: Software and Hardware Co-design for Architecture-level IR-drop Mitigation in High-performance PIM	Yuanpeng Zhang, Xing Hu, Xi Chen, Zhihang Yuan, Cong Li, Jingchen Zhu, Zhao Wang, Chenguang Zhang, Xin Si, Wei Gao, Qiang Wu, Runsheng Wang, Guangyu Sun	2025-11-06	Download	SRAM Processing-in-Memory (PIM) has emerged as the most promising implementation for high-performance PIM, delivering superior computing density, energy efficiency, and computational precision.
FiCABU: A Fisher-Based, Context-Adaptive Machine Unlearning Processor for Edge AI	Eun-Su Cho, Jongin Choi, Jeongmin Jin, Jae-Jin Lee, Woojoo Lee	2025-11-06	Download	Machine unlearning, driven by privacy regulations and the "right to be forgotten", is increasingly needed at the edge, yet server-centric or retraining-heavy methods are impractical under tight comput...
Disaggregated Architectures and the Redesign of Data Center Ecosystems: Scheduling, Pooling, and Infrastructure Trade-offs	Chao Guo, Jiahe Xu, Moshe Zukerman	2025-11-06	Download	Hardware disaggregation seeks to transform Data Center (DC) resources from traditional server fleets into unified resource pools. Despite existing challenges that may hinder its full realization, sign...
PICNIC: Silicon Photonic Interconnected Chiplets with Computational Network and In-memory Computing for LLM Inference Acceleration	Yue Jiet Chong, Yimin Wang, Zhen Wu, Xuanyao Fong	2025-11-06	Download	This paper presents a 3D-stacked chiplets based large language model (LLM) inference accelerator, consisting of non-volatile in-memory-computing processing elements (PEs) and Inter-PE Computational Ne...
From Minutes to Seconds: Redefining the Five-Minute Rule for AI-Era Memory Hierarchies	Tong Zhang, Vikram Sharma Mailthody, Fei Sun, Linsen Ma, Chris J. Newburn, Teresa Zhang, Yang Liu, Jiangpeng Li, Hao Zhong, Wen-Mei Hwu	2025-11-06	Download	In 1987, Jim Gray and Gianfranco Putzolu introduced the five-minute rule, a simple, storage-memory-economics-based heuristic for deciding when data should live in DRAM rather than on storage.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Marionette: Data Structure Description and Management for Heterogeneous Computing	Nuno dos Santos Fernandes, Pedro Tomás, Nuno Roma, Frank Winklmeier, Patricia Conde-Muíño	2025-11-06	Download	Adapting large, object-oriented C++ codebases for hardware acceleration might be extremely challenging, particularly when targeting heterogeneous platforms such as GPUs.
Resolving Conflicts with Grace: Dynamically Concurrent Universality	Petr Kuznetsov, Nathan Josia Schrodt	2025-11-06	Download	Synchronization is the major obstacle to scalability in distributed computing. Concurrent operations on the shared data engage in synchronization when they encounter a conflict, i.e.
A New Probabilistic Mobile Byzantine Failure Model for Self-Protecting Systems	Silvia Bonomi, Giovanni Farina, Roy Friedman, Eviatar B. Procaccia, Sebastien Tixeuil	2025-11-06	Download	Modern distributed systems face growing security threats, as attackers continuously enhance their skills and vulnerabilities span across the entire system stack, from hardware to the application layer...
Scalable Domain-decomposed Monte Carlo Neutral Transport for Nuclear Fusion	Oskar Lappi, Huw Leggate, Yannick Marandet, Jan Åström, Keijo Heljanko, Dmitriy V. Borodin	2025-11-06	Download	EIRENE [1] is a Monte Carlo neutral transport solver heavily used in the fusion community. EIRENE does not implement domain decomposition, making it impossible to use for simulations where the grid da...
Enabling Dynamic Sparsity in Quantized LLM Inference	Rongxiang Wang, Kangyuan Shu, Felix Xiaozhu Lin	2025-11-06	Download	Deploying large language models (LLMs) on end-user devices is gaining importance due to benefits in responsiveness, privacy, and operational cost.
AIvailable: A Software-Defined Architecture for LLM-as-a-Service on Heterogeneous and Legacy GPUs	Pedro Antunes, Ana Rita Ortigoso, Gabriel Vieira, Daniel Fuentes, Luís Frazão, Nuno Costa, António Pereira	2025-11-06	Download	The rise of Large Language Models (LLM) has increased the need for scalable, high-performance inference systems, yet most existing frameworks assume homogeneous, resource-rich hardware, often unrealis...
Parallel Spawning Strategies for Dynamic-Aware MPI Applications	Iker Martín-Álvarez, José I. Aliaga, Maribel Castillo	2025-11-06	Download	Dynamic resource management is an increasingly important capability of High Performance Computing systems, as it enables jobs to adjust their resource allocation at runtime.
A Reinforced Evolution-Based Approach to Multi-Resource Load Balancing	Leszek Sliwko	2025-11-06	Download	This paper presents a reinforced genetic approach to a defined d-resource system optimization problem. The classical evolution schema was ineffective due to a very strict feasibility function in the s...
DIAP: A Decentralized Agent Identity Protocol with Zero-Knowledge Proofs and a Hybrid P2P Stack	Yuanjie Liu, Wenpeng Xing, Ye Zhou, Gaowei Chang, Changting Lin, Meng Han	2025-11-06	Download	The absence of a fully decentralized, verifiable, and privacy-preserving communication protocol for autonomous agents remains a core challenge in decentralized computing.
Stochastic Modeling for Energy-Efficient Edge Infrastructure	Fabio Diniz Rossi	2025-11-06	Download	Edge Computing enables low-latency processing for real-time applications but introduces challenges in power management due to the distributed nature of edge devices and their limited energy resources.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Bit-Flipping Attack Exploration and Countermeasure in 5G Network	Joon Kim, Chengwei Duan, Sandip Ray	2025-11-06	Download	5G communication technology has become a vital component in a wide range of applications due to its unique advantages such as high data rate and low latency.
Improving dynamic congestion isolation in data-center networks	Alberto Merino, Jesus Escudero-Sahuquillo, Pedro Javier Garcia, Francisco J. Quiles	2025-11-06	Download	The rise of distributed AI and large-scale applications has impacted the communication operations of data-center and supercomputer interconnection networks, leading to dramatic incast or in-network co...
Age of Job Completion Minimization with Stable Queues	Stavros Mitrolaris, Subhankar Banerjee, Sennur Ulukus	2025-11-06	Download	We consider a time-slotted job-assignment system with a central server, N users and a machine which changes its state according to a Markov chain (hence called a Markov machine).
HiFiNet: Hierarchical Fault Identification in Wireless Sensor Networks via Edge-Based Classification and Graph Aggregation	Nguyen Van Son, Nguyen Tri Nghia, Nguyen Thi Hanh, Huynh Thi Thanh Binh	2025-11-06	Download	Wireless Sensor Networks (WSN) are the backbone of essential monitoring applications, but their deployment in unfavourable conditions increases the risk to data integrity and system reliability.
AIvailable: A Software-Defined Architecture for LLM-as-a-Service on Heterogeneous and Legacy GPUs	Pedro Antunes, Ana Rita Ortigoso, Gabriel Vieira, Daniel Fuentes, Luís Frazão, Nuno Costa, António Pereira	2025-11-06	Download	The rise of Large Language Models (LLM) has increased the need for scalable, high-performance inference systems, yet most existing frameworks assume homogeneous, resource-rich hardware, often unrealis...
Hybrid Quantum-Classical Detection for RIS-Assisted SC-FDE via Grover Adaptive Search	Maryam Tariq, Omar Alhussein, Raneem Abdelraheem, Abdullah Quran, Georges Kaddoum, Sami Muhaidat	2025-11-06	Download	Wideband and low-latency requirements in sixth-generation (6G) networks demand detectors that approach maximum-likelihood (ML) performance without incurring exponential complexity.
Disaggregated Architectures and the Redesign of Data Center Ecosystems: Scheduling, Pooling, and Infrastructure Trade-offs	Chao Guo, Jiahe Xu, Moshe Zukerman	2025-11-06	Download	Hardware disaggregation seeks to transform Data Center (DC) resources from traditional server fleets into unified resource pools. Despite existing challenges that may hinder its full realization, sign...
OTS-PC: OTS-based Payment Channels for the Lightning Network	Sergio Demian Lerner, Ariel Futoransky	2025-11-06	Download	We present a new type of bidirectional payment channel based on One-Time Signatures on state sequence numbers. This new construction is simpler than the Poon-Dryja construction, but provides a number ...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
An MLCommons Scientific Benchmarks Ontology	Ben Hawks, Gregor von Laszewski, Matthew D. Sinclair, Marco Colombo, Shivaram Venkataraman, Rutwik Jain, Yiwei Jiang, Nhan Tran, Geoffrey Fox	2025-11-06	Download	Scientific machine learning research spans diverse domains and data modalities, yet existing benchmark efforts remain siloed and lack standardization.
Scalable Domain-decomposed Monte Carlo Neutral Transport for Nuclear Fusion	Oskar Lappi, Huw Leggate, Yannick Marandet, Jan Åström, Keijo Heljanko, Dmitriy V. Borodin	2025-11-06	Download	EIRENE [1] is a Monte Carlo neutral transport solver heavily used in the fusion community. EIRENE does not implement domain decomposition, making it impossible to use for simulations where the grid da...