2025-10-20

cs.AR - Hardware Architecture

Title	Authors	Published	PDF	Abstract
ReLANCE: A Resource-Efficient Low-Latency Cortical Neural Acceleration Engine	Sonu Kumar, Arjun S. Nair, Bhawna Chaudhary, Mukul Lokhande, Santosh Kumar Vishvakarma	2025-10-20	Download	We present a Cortical Neural Pool (CNP) architecture featuring a high-speed, resource-efficient CORDIC based Hodgkin-Huxley (RCHH) neuron model.
RAS: A Bit-Exact rANS Accelerator For High-Performance Neural Lossless Compression	Yuchao Qin, Anjunyi Fan, Bonan Yan	2025-10-20	Download	Data centers handle vast volumes of data that require efficient lossless compression, yet emerging probabilistic models based methods are often computationally slow.
SmaRTLy: RTL Optimization with Logic Inferencing and Structural Rebuilding	Chengxi Li, Yang Sun, Lei Chen, Yiwen Wang, Mingxuan Yuan, Evangeline F. Y. Young	2025-10-20	Download	This paper proposes smaRTLy: a new optimization technique for multiplexers in Register-Transfer Level (RTL) logic synthesis. Multiplexer trees are very common in RTL designs, and traditional tools lik...
SOLE: Hardware-Software Co-design of Softmax and LayerNorm for Efficient Transformer Inference	Wenxun Wang, Shuchang Zhou, Wenyu Sun, Peiqin Sun, Yongpan Liu	2025-10-20	Download	Transformers have shown remarkable performance in both natural language processing (NLP) and computer vision (CV) tasks. However, their real-time inference speed and efficiency are limited due to the ...
YAP+: Pad-Layout-Aware Yield Modeling and Simulation for Hybrid Bonding	Zhichao Chen, Puneet Gupta	2025-10-20	Download	Three-dimensional (3D) integration continues to advance Moore's Law by facilitating dense interconnects and enabling multi-tier system architectures.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Efficient Multi-Worker Selection based Distributed Swarm Learning via Analog Aggregation	Zhuoyu Yao, Yue Wang, Songyang Zhang, Yingshu Li, Zhipeng Cai, Zhi Tian	2025-10-20	Download	Recent advances in distributed learning systems have introduced effective solutions for implementing collaborative artificial intelligence techniques in wireless communication networks.
Efficient Long-context Language Model Training by Core Attention Disaggregation	Yonghao Zhuang, Junda Chen, Bo Pang, Yi Gu, Yibo Zhu, Yimin Jiang, Ion Stoica, Eric Xing, Hao Zhang	2025-10-20	Download	We present core attention disaggregation (CAD), a technique that improves long-context large language model training by decoupling the core attention computation, softmax(QK^T)V, from the rest of the ...
A New Broadcast Model for Several Network Topologies	Hongbo Lu, Junsung Hwang, Bernard Tenreiro, Nabila Jaman Tripti, Darren Hamilton, Yuefan Deng	2025-10-20	Download	We present Broadcast by Balanced Saturation (BBS), a general broadcast algorithm designed to optimize communication efficiency across diverse network topologies.
AI for Distributed Systems Design: Scalable Cloud Optimization Through Repeated LLMs Sampling And Simulators	Jacopo Tagliabue	2025-10-20	Download	We explore AI-driven distributed-systems policy design by combining stochastic code generation from large language models (LLMs) with deterministic verification in a domain-specific simulator.
Quantum Federated Learning: Architectural Elements and Future Directions	Siva Sai, Abhishek Sawaika, Prabhjot Singh, Rajkumar Buyya	2025-10-20	Download	Federated learning (FL) focuses on collaborative model training without the need to move the private data silos to a central server. Despite its several benefits, the classical FL is plagued with seve...
On the Universality of Round Elimination Fixed Points	Alkida Balliu, Sebastian Brandt, Ole Gabsdil, Dennis Olivetti, Jukka Suomela	2025-10-20	Download	Recent work on distributed graph algorithms [e.g. STOC 2022, ITCS 2022, PODC 2020] has drawn attention to the following open question: are round elimination fixed points a universal technique for prov...
DynaKV: Enabling Accurate and Efficient Long-Sequence LLM Decoding on Smartphones	Tuowei Wang, Minxing Huang, Fengzu Li, Ligeng Chen, Jinrui Zhang, Ju Ren	2025-10-20	Download	As the demand for human-like reasoning, multi-turn dialogues, and long-form responses grows, large language models (LLMs) are increasingly expected to support efficient and effective long-sequence dec...
Network and Systems Performance Characterization of MCP-Enabled LLM Agents	Zihao Ding, Mufeng Zhu, Yao Liu	2025-10-20	Download	Model Context Protocol (MCP) has recently gained increased attention within the AI community for providing a standardized way for large language models (LLMs) to interact with external tools and servi...
Integrating Performance Tools in Model Reasoning for GPU Kernel Optimization	Daniel Nichols, Konstantinos Parasyris, Charles Jekel, Abhinav Bhatele, Harshitha Menon	2025-10-20	Download	Language models are now prevalent in software engineering with many developers using them to automate tasks and accelerate their development. While language models have been tremendous at accomplishin...
An Evaluation of LLMs Inference on Popular Single-board Computers	Tung Nguyen, Tuyen Nguyen	2025-10-20	Download	The growing demand for on-device large language model (LLM) inference is driving interest in deploying lightweight, cost-effective AI solutions on edge hardware.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Black-Box Evasion Attacks on Data-Driven Open RAN Apps: Tailored Design and Experimental Evaluation	Pranshav Gajjar, Molham Khoja, Abiodun Ganiyu, Marc Juarez, Mahesh K. Marina, Andrew Lehane, Vijay K. Shah	2025-10-20	Download	The impending adoption of Open Radio Access Network (O-RAN) is fueling innovation in the RAN towards data-driven operation. Unlike traditional RAN where the RAN data and its usage is restricted within...
A New Broadcast Model for Several Network Topologies	Hongbo Lu, Junsung Hwang, Bernard Tenreiro, Nabila Jaman Tripti, Darren Hamilton, Yuefan Deng	2025-10-20	Download	We present Broadcast by Balanced Saturation (BBS), a general broadcast algorithm designed to optimize communication efficiency across diverse network topologies.
Pointing-Error-Induced Fading in an Open-Loop THz Uplink with Hardware Impairments	P. Brach del Prever, P. Testolina, A. Masihi, S. Petrushkevich, M. Polese, T. Melodia, J. M. Jornet	2025-10-20	Download	We analyze the open-loop mechanical tracking performance of a sub-Terahertz (sub-THz) and Terahertz (THz) uplink communication system. These high-frequency bands enable multi-gigabit links through lar...
Adaptive Local Combining with Decentralized Decoding for Distributed Massive MIMO	Mohd Saif Ali Khan, Karthik RM, Samar Agnihotri	2025-10-20	Download	Efficient uplink processing in distributed massive multiple-input multiple-output (D-mMIMO) systems requires both effective local combining and scalable decoding to significantly mitigate inter-user i...
Is It Worth to Use Feedback Channel in 5G V2X Platoon Scenarios?	Dmitry Bankov, Artem Krasilov, Artem Otmakhov, Pavel Savlukovich, Evgeny Khorov	2025-10-20	Download	5G Vehicle-to-Everything (V2X) is a new technology developed by 3GPP to support inter-vehicle communication. In contrast to 4G V2X which allows only broadcast communication, 5G V2X enables groupcast a...
Enhancing 5G V2X Mode 2 for Sporadic Traffic	Dmitry Bankov, Artem Krasilov, Artem Otmakhov, Aleksei Shashin, Evgeny Khorov	2025-10-20	Download	The emerging road safety and autonomous vehicle applications require timely and reliable data delivery between vehicles and between vehicles and infrastructure.
AoA Services in 5G Networks: A Framework for Real-World Implementation and Systematic Testing	Alberto Ceresoli, Viola Bernazzoli, Roberto Pegurri, Ilario Filippini	2025-10-20	Download	Accurate positioning is a key enabler for emerging 5G applications. While the standardized Location Management Function (LMF) operates centrally within the core network, its scalability and latency li...
Network and Systems Performance Characterization of MCP-Enabled LLM Agents	Zihao Ding, Mufeng Zhu, Yao Liu	2025-10-20	Download	Model Context Protocol (MCP) has recently gained increased attention within the AI community for providing a standardized way for large language models (LLMs) to interact with external tools and servi...
Mamba4Net: Distilled Hybrid Mamba Large Language Models For Networking	Linhan Xia, Mingzhan Yang, Jingjing Wang, Ziwei Yan, Yakun Ren, Guo Yu, Kai Lei	2025-10-20	Download	Transformer-based large language models (LLMs) are increasingly being adopted in networking research to address domain-specific challenges. However, their quadratic time complexity and substantial mod...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Insum: Sparse GPU Kernels Simplified and Optimized with Indirect Einsums	Jaeyeon Won, Willow Ahrens, Joel S. Emer, Saman Amarasinghe	2025-10-20	Download	Programming high-performance sparse GPU kernels is notoriously difficult, requiring both substantial effort and deep expertise. Sparse compilers aim to simplify this process, but existing systems fall...
Integrating Performance Tools in Model Reasoning for GPU Kernel Optimization	Daniel Nichols, Konstantinos Parasyris, Charles Jekel, Abhinav Bhatele, Harshitha Menon	2025-10-20	Download	Language models are now prevalent in software engineering with many developers using them to automate tasks and accelerate their development. While language models have been tremendous at accomplishin...