2025-10-20
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| ReLANCE: A Resource-Efficient Low-Latency Cortical Neural Acceleration Engine | Sonu Kumar, Arjun S. Nair, Bhawna Chaudhary, Mukul Lokhande, Santosh Kumar Vishvakarma | 2025-10-20 | Download | We present a Cortical Neural Pool (CNP) architecture featuring a high-speed, resource-efficient CORDIC based Hodgkin-Huxley (RCHH) neuron model. |
| RAS: A Bit-Exact rANS Accelerator For High-Performance Neural Lossless Compression | Yuchao Qin, Anjunyi Fan, Bonan Yan | 2025-10-20 | Download | Data centers handle vast volumes of data that require efficient lossless compression, yet emerging probabilistic models based methods are often computationally slow. |
| SmaRTLy: RTL Optimization with Logic Inferencing and Structural Rebuilding | Chengxi Li, Yang Sun, Lei Chen, Yiwen Wang, Mingxuan Yuan, Evangeline F. Y. Young | 2025-10-20 | Download | This paper proposes smaRTLy: a new optimization technique for multiplexers in Register-Transfer Level (RTL) logic synthesis. Multiplexer trees are very common in RTL designs, and traditional tools lik... |
| SOLE: Hardware-Software Co-design of Softmax and LayerNorm for Efficient Transformer Inference | Wenxun Wang, Shuchang Zhou, Wenyu Sun, Peiqin Sun, Yongpan Liu | 2025-10-20 | Download | Transformers have shown remarkable performance in both natural language processing (NLP) and computer vision (CV) tasks. However, their real-time inference speed and efficiency are limited due to the ... |
| YAP+: Pad-Layout-Aware Yield Modeling and Simulation for Hybrid Bonding | Zhichao Chen, Puneet Gupta | 2025-10-20 | Download | Three-dimensional (3D) integration continues to advance Moore's Law by facilitating dense interconnects and enabling multi-tier system architectures. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Efficient Multi-Worker Selection based Distributed Swarm Learning via Analog Aggregation | Zhuoyu Yao, Yue Wang, Songyang Zhang, Yingshu Li, Zhipeng Cai, Zhi Tian | 2025-10-20 | Download | Recent advances in distributed learning systems have introduced effective solutions for implementing collaborative artificial intelligence techniques in wireless communication networks. |
| Efficient Long-context Language Model Training by Core Attention Disaggregation | Yonghao Zhuang, Junda Chen, Bo Pang, Yi Gu, Yibo Zhu, Yimin Jiang, Ion Stoica, Eric Xing, Hao Zhang | 2025-10-20 | Download | We present core attention disaggregation (CAD), a technique that improves long-context large language model training by decoupling the core attention computation, softmax(QK^T)V, from the rest of the ... |
| A New Broadcast Model for Several Network Topologies | Hongbo Lu, Junsung Hwang, Bernard Tenreiro, Nabila Jaman Tripti, Darren Hamilton, Yuefan Deng | 2025-10-20 | Download | We present Broadcast by Balanced Saturation (BBS), a general broadcast algorithm designed to optimize communication efficiency across diverse network topologies. |
| AI for Distributed Systems Design: Scalable Cloud Optimization Through Repeated LLMs Sampling And Simulators | Jacopo Tagliabue | 2025-10-20 | Download | We explore AI-driven distributed-systems policy design by combining stochastic code generation from large language models (LLMs) with deterministic verification in a domain-specific simulator. |
| Quantum Federated Learning: Architectural Elements and Future Directions | Siva Sai, Abhishek Sawaika, Prabhjot Singh, Rajkumar Buyya | 2025-10-20 | Download | Federated learning (FL) focuses on collaborative model training without the need to move the private data silos to a central server. Despite its several benefits, the classical FL is plagued with seve... |
| On the Universality of Round Elimination Fixed Points | Alkida Balliu, Sebastian Brandt, Ole Gabsdil, Dennis Olivetti, Jukka Suomela | 2025-10-20 | Download | Recent work on distributed graph algorithms [e.g. STOC 2022, ITCS 2022, PODC 2020] has drawn attention to the following open question: are round elimination fixed points a universal technique for prov... |
| DynaKV: Enabling Accurate and Efficient Long-Sequence LLM Decoding on Smartphones | Tuowei Wang, Minxing Huang, Fengzu Li, Ligeng Chen, Jinrui Zhang, Ju Ren | 2025-10-20 | Download | As the demand for human-like reasoning, multi-turn dialogues, and long-form responses grows, large language models (LLMs) are increasingly expected to support efficient and effective long-sequence dec... |
| Network and Systems Performance Characterization of MCP-Enabled LLM Agents | Zihao Ding, Mufeng Zhu, Yao Liu | 2025-10-20 | Download | Model Context Protocol (MCP) has recently gained increased attention within the AI community for providing a standardized way for large language models (LLMs) to interact with external tools and servi... |
| Integrating Performance Tools in Model Reasoning for GPU Kernel Optimization | Daniel Nichols, Konstantinos Parasyris, Charles Jekel, Abhinav Bhatele, Harshitha Menon | 2025-10-20 | Download | Language models are now prevalent in software engineering with many developers using them to automate tasks and accelerate their development. While language models have been tremendous at accomplishin... |
| An Evaluation of LLMs Inference on Popular Single-board Computers | Tung, Nguyen, Tuyen Nguyen | 2025-10-20 | Download | The growing demand for on-device large language model (LLM) inference is driving interest in deploying lightweight, cost-effective AI solutions on edge hardware. |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Black-Box Evasion Attacks on Data-Driven Open RAN Apps: Tailored Design and Experimental Evaluation | Pranshav Gajjar, Molham Khoja, Abiodun Ganiyu, Marc Juarez, Mahesh K. Marina, Andrew Lehane, Vijay K. Shah | 2025-10-20 | Download | The impending adoption of Open Radio Access Network (O-RAN) is fueling innovation in the RAN towards data-driven operation. Unlike traditional RAN where the RAN data and its usage is restricted within... |
| A New Broadcast Model for Several Network Topologies | Hongbo Lu, Junsung Hwang, Bernard Tenreiro, Nabila Jaman Tripti, Darren Hamilton, Yuefan Deng | 2025-10-20 | Download | We present Broadcast by Balanced Saturation (BBS), a general broadcast algorithm designed to optimize communication efficiency across diverse network topologies. |
| Pointing-Error-Induced Fading in an Open-Loop THz Uplink with Hardware Impairments | P. Brach del Prever, P. Testolina, A. Masihi, S. Petrushkevich, M. Polese, T. Melodia, J. M. Jornet | 2025-10-20 | Download | We analyze the open-loop mechanical tracking performance of a sub-Terahertz (sub-THz) and Terahertz (THz) uplink communication system. These high-frequency bands enable multi-gigabit links through lar... |
| Adaptive Local Combining with Decentralized Decoding for Distributed Massive MIMO | Mohd Saif Ali Khan, Karthik RM, Samar Agnihotri | 2025-10-20 | Download | Efficient uplink processing in distributed massive multiple-input multiple-output (D-mMIMO) systems requires both effective local combining and scalable decoding to significantly mitigate inter-user i... |
| Is It Worth to Use Feedback Channel in 5G V2X Platoon Scenarios? | Dmitry Bankov, Artem Krasilov, Artem Otmakhov, Pavel Savlukovich, Evgeny Khorov | 2025-10-20 | Download | 5G Vehicle-to-Everything (V2X) is a new technology developed by 3GPP to support inter-vehicle communication. In contrast to 4G V2X which allows only broadcast communication, 5G V2X enables groupcast a... |
| Enhancing 5G V2X Mode 2 for Sporadic Traffic | Dmitry Bankov, Artem Krasilov, Artem Otmakhov, Aleksei Shashin, Evgeny Khorov | 2025-10-20 | Download | The emerging road safety and autonomous vehicle applications require timely and reliable data delivery between vehicles and between vehicles and infrastructure. |
| AoA Services in 5G Networks: A Framework for Real-World Implementation and Systematic Testing | Alberto Ceresoli, Viola Bernazzoli, Roberto Pegurri, Ilario Filippini | 2025-10-20 | Download | Accurate positioning is a key enabler for emerging 5G applications. While the standardized Location Management Function (LMF) operates centrally within the core network, its scalability and latency li... |
| Network and Systems Performance Characterization of MCP-Enabled LLM Agents | Zihao Ding, Mufeng Zhu, Yao Liu | 2025-10-20 | Download | Model Context Protocol (MCP) has recently gained increased attention within the AI community for providing a standardized way for large language models (LLMs) to interact with external tools and servi... |
| Mamba4Net: Distilled Hybrid Mamba Large Language Models For Networking | Linhan Xia, Mingzhan Yang, Jingjing Wang, Ziwei Yan, Yakun Ren, Guo Yu, Kai Lei | 2025-10-20 | Download | Transformer-based large language models (LLMs) are increasingly being adopted in networking research to address domain-specific challenges. However, their quadratic time complexity and substantial mod... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Insum: Sparse GPU Kernels Simplified and Optimized with Indirect Einsums | Jaeyeon Won, Willow Ahrens, Joel S. Emer, Saman Amarasinghe | 2025-10-20 | Download | Programming high-performance sparse GPU kernels is notoriously difficult, requiring both substantial effort and deep expertise. Sparse compilers aim to simplify this process, but existing systems fall... |
| Integrating Performance Tools in Model Reasoning for GPU Kernel Optimization | Daniel Nichols, Konstantinos Parasyris, Charles Jekel, Abhinav Bhatele, Harshitha Menon | 2025-10-20 | Download | Language models are now prevalent in software engineering with many developers using them to automate tasks and accelerate their development. While language models have been tremendous at accomplishin... |