2025-05-29

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Cache Your Prompt When It's Green: Carbon-Aware Caching for Large Language Model Serving	Yuyang Tian, Desen Sun, Yi Ding, Sihang Liu	2025-05-29	Download	As large language models (LLMs) become widely used, their environmental impact, especially carbon emission, has attracted more attention. Prior studies focus on compute-related carbon emissions.
A Unified Framework for Mapping and Synthesis of Approximate R-Blocks CGRAs	Georgios Alexandris, Panagiotis Chaidos, Alexis Maras, Barry de Bruin, Manil Dev Gomony, Henk Corporaal, Dimitrios Soudris, Sotirios Xydis	2025-05-29	Download	The ever-increasing complexity and operational diversity of modern Neural Networks (NNs) have caused the need for low-power and, at the same time, high-performance edge devices for AI applications.
A Novel Cost-Effective MIMO Architecture with Ray Antenna Array for Enhanced Wireless Communication Performance	Zhenjun Dong, Zhiwen Zhou, Yong Zeng	2025-05-29	Download	This paper proposes a novel multi-antenna architecture, termed ray antenna array (RAA), which practically enables flexible beamforming and also enhances wireless communication performance for high fre...
Energy-Efficient QoS-Aware Scheduling for S-NUCA Many-Cores	Sudam M. Wasala, Jurre Wolff, Yixian Shen, Anuj Pathania, Clemens Grelck, Andy D. Pimentel	2025-05-29	Download	Optimizing performance and energy efficiency in many-core processors, especially within Non-Uniform Cache Access (NUCA) architectures, remains a critical challenge.
Towards LLM-based Generation of Human-Readable Proofs in Polynomial Formal Verification	Rolf Drechsler	2025-05-29	Download	Verification is one of the central tasks in circuit and system design. While simulation and emulation are widely used, complete correctness can only be ensured based on formal proof techniques.
DX100: A Programmable Data Access Accelerator for Indirection	Alireza Khadem, Kamalavasan Kamalakkannan, Zhenyan Zhu, Akash Poptani, Yufeng Gu, Jered Benjamin Dominguez-Trujillo, Nishil Talati, Daichi Fujiki, Scott Mahlke, Galen Shipman, Reetuparna Das	2025-05-29	Download	Indirect memory accesses frequently appear in applications where memory bandwidth is a critical bottleneck. Prior indirect memory access proposals, such as indirect prefetchers, runahead execution, fe...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Cache Your Prompt When It's Green: Carbon-Aware Caching for Large Language Model Serving	Yuyang Tian, Desen Sun, Yi Ding, Sihang Liu	2025-05-29	Download	As large language models (LLMs) become widely used, their environmental impact, especially carbon emission, has attracted more attention. Prior studies focus on compute-related carbon emissions.
From Connectivity to Autonomy: The Dawn of Self-Evolving Communication Systems	Zeinab Nezami, Syed Danial Ali Shah, Maryam Hafeez, Karim Djemame, Syed Ali Raza Zaidi	2025-05-29	Download	This paper envisions 6G as a self-evolving telecom ecosystem, where AI-driven intelligence enables dynamic adaptation beyond static connectivity.
Distributed Federated Learning for Vehicular Network Security: Anomaly Detection Benefits and Multi-Domain Attack Threats	Utku Demir, Yalin E. Sagduyu, Tugba Erpek, Hossein Jafari, Sastry Kompella, Mengran Xue	2025-05-29	Download	In connected and autonomous vehicles, machine learning for safety message classification has become critical for detecting malicious or anomalous behavior.
Complementary Time-Space Tradeoff for Self-Stabilizing Leader Election: Polynomial States Meet Sublinear Time	Yuichi Sudo	2025-05-29	Download	We study the self-stabilizing leader election (SS-LE) problem in the population protocol model, assuming exact knowledge of the population size $n$.
Accelerated Training of Federated Learning via Second-Order Methods	Mrinmay Sen, Sidhant R Nair, C Krishna Mohan	2025-05-29	Download	This paper explores second-order optimization methods in Federated Learning (FL), addressing the critical challenges of slow convergence and the excessive communication rounds required to achieve opti...
Sustainable Carbon-Aware and Water-Efficient LLM Scheduling in Geo-Distributed Cloud Datacenters	Hayden Moore, Sirui Qi, Ninad Hogade, Dejan Milojicic, Cullen Bash, Sudeep Pasricha	2025-05-29	Download	In recent years, Large Language Models (LLM) such as ChatGPT, CoPilot, and Gemini have been widely adopted in different areas. As the use of LLMs continues to grow, many efforts have focused on reduci...
Efficient AllReduce with Stragglers	Arjun Devraj, Eric Ding, Abhishek Vijaya Kumar, Robert Kleinberg, Rachee Singh	2025-05-29	Download	Distributed machine learning workloads use data and tensor parallelism for training and inference, both of which rely on the AllReduce collective to synchronize gradients or activations.
D-Rex: Heterogeneity-Aware Reliability Framework and Adaptive Algorithms for Distributed Storage	Maxime Gonthier, Dante D. Sanchez-Gallegos, Haochen Pan, Bogdan Nicolae, Sicheng Zhou, Hai Duc Nguyen, Valerie Hayot-Sasson, J. Gregory Pauloski, Jesus Carretero, Kyle Chard, Ian Foster	2025-05-29	Download	The exponential growth of data necessitates distributed storage models, such as peer-to-peer systems and data federations. While distributed storage can reduce costs and increase reliability, the hete...
Evaluating the Efficacy of LLM-Based Reasoning for Multiobjective HPC Job Scheduling	Prachi Jadhav, Hongwei Jin, Ewa Deelman, Prasanna Balaprakash	2025-05-29	Download	High-Performance Computing (HPC) job scheduling involves balancing conflicting objectives such as minimizing makespan, reducing wait times, optimizing resource use, and ensuring fairness.
NestedFP: High-Performance, Memory-Efficient Dual-Precision Floating Point Support for LLMs	Haeun Lee, Omin Kwon, Yeonhong Park, Jae W. Lee	2025-05-29	Download	Meeting service-level objectives (SLOs) in Large Language Models (LLMs) serving is critical, but managing the high variability in load presents a significant challenge.
SealOS+: A Sealos-based Approach for Adaptive Resource Optimization Under Dynamic Workloads for Securities Trading System	Haojie Jia, Zhenhao Li, Gen Li, Minxian Xu, Kejiang Ye	2025-05-29	Download	As securities trading systems transition to a microservices architecture, optimizing system performance presents challenges such as inefficient resource scheduling and high service response delays.
MemAscend: System Memory Optimization for SSD-Offloaded LLM Fine-Tuning	Yong-Cheng Liaw, Shuo-Han Chen	2025-05-29	Download	Owing to the huge success of generative artificial intelligence (AI), large language models (LLMs) have emerged as a core subclass, underpinning applications such as question answering, text generatio...
Ghidorah: Fast LLM Inference on Edge with Speculative Decoding and Hetero-Core Parallelism	Jinhui Wei, Ye Huang, Yuhui Zhou, Jiazhi Jiang, Jiangsu Du, Yutong Lu	2025-05-29	Download	In-situ LLM inference on end-user devices has gained significant interest due to its privacy benefits and reduced dependency on external infrastructure.
The Panaceas for Improving Low-Rank Decomposition in Communication-Efficient Federated Learning	Shiwei Li, Xiandi Luo, Haozhao Wang, Xing Tang, Shijie Xu, Weihong Luo, Yuhua Li, Xiuqiang He, Ruixuan Li	2025-05-29	Download	To improve the training efficiency of federated learning (FL), previous research has employed low-rank decomposition techniques to reduce communication overhead.
DOPPLER: Dual-Policy Learning for Device Assignment in Asynchronous Dataflow Graphs	Xinyu Yao, Daniel Bourgeois, Abhinav Jain, Yuxin Tang, Jiawen Yao, Zhimin Ding, Arlei Silva, Chris Jermaine	2025-05-29	Download	We study the problem of assigning operations in a dataflow graph to devices to minimize execution time in a work-conserving system, with emphasis on complex machine learning workloads.
Speeding up Model Loading with fastsafetensors	Takeshi Yoshimura, Tatsuhiro Chiba, Manish Sethi, Daniel Waddington, Swaminathan Sundararaman	2025-05-29	Download	The rapid increase in model parameter sizes introduces new challenges in pre-trained model loading. Currently, machine learning code often deserializes each parameter as a tensor object in host memor...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
NASP: Network Slice as a Service Platform for 5G Networks	Felipe Hauschild Grings, Gustavo Zanatta Bruno, Lucio Rene Prade, Cristiano Bonato Both, José Marcos Camara Brito	2025-05-29	Download	With 5G's rapid global uptake, demand for agile private networks has exploded. A defining beyond-5G capability is network slicing. 3GPP specifies three core slice categories, massive Machine-Type Comm...
A tertiary review on quantum cryptography	Luiz Filipi Anderson de Sousa Moura, Carlos Becker Westphall	2025-05-29	Download	Quantum computers pose an immense threat to system security. As a countermeasure, new cryptographic classes have been created to prevent these attacks.
Searching Neural Architectures for Sensor Nodes on IoT Gateways	Andrea Mattia Garavagno, Edoardo Ragusa, Antonio Frisoli, Paolo Gastaldo	2025-05-29	Download	This paper presents an automatic method for the design of Neural Networks (NNs) at the edge, enabling Machine Learning (ML) access even in privacy-sensitive Internet of Things (IoT) applications.
Distributed Federated Learning for Vehicular Network Security: Anomaly Detection Benefits and Multi-Domain Attack Threats	Utku Demir, Yalin E. Sagduyu, Tugba Erpek, Hossein Jafari, Sastry Kompella, Mengran Xue	2025-05-29	Download	In connected and autonomous vehicles, machine learning for safety message classification has become critical for detecting malicious or anomalous behavior.
Towards A Global Quantum Internet: A Review of Challenges Facing Aerial Quantum Networks	Nitin Jha, Abhishek Parakh	2025-05-29	Download	Quantum networks use principles of quantum physics to create secure communication networks. Moving these networks off the ground using drones, balloons, or satellites could help increase the scalabili...
Quantum Hilbert Transform	Nitin Jha, Abhishek Parakh	2025-05-29	Download	The Hilbert transform has been one of the foundational transforms in signal processing, finding its way into multiple disciplines from cryptography to biomedical sciences.
Wireless Agentic AI with Retrieval-Augmented Multimodal Semantic Perception	Guangyuan Liu, Yinqiu Liu, Ruichen Zhang, Hongyang Du, Dusit Niyato, Zehui Xiong, Sumei Sun, Abbas Jamalipour	2025-05-29	Download	The rapid development of multimodal AI and Large Language Models (LLMs) has greatly enhanced real-time interaction, decision-making, and collaborative tasks.
Context-Aware Semantic Communication for the Wireless Networks	Guangyuan Liu, Yinqiu Liu, Jiacheng Wang, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Abbas Jamalipour	2025-05-29	Download	In next-generation wireless networks, supporting real-time applications such as augmented reality, autonomous driving, and immersive Metaverse services demands stringent constraints on bandwidth, late...
LUMION: Fast Fault Recovery for ML Jobs Using Programmable Optical Fabrics	Abhishek Vijaya Kumar, Eric Ding, Arjun Devraj, Darius Bunandar, Rachee Singh	2025-05-29	Download	When accelerators fail in modern ML datacenters, operators migrate the affected ML training or inference jobs to entirely new racks. This approach, while preserving network performance, is highly inef...
Agile Orchestration at Will: An Entire Smart Service-Based Security Architecture Towards 6G	Zhuoran Duan, Guoshun Nan, Rushan Li, Zijun Wang, Lihua Xiong, Chaoying Yuan, Guorong Liu, Hui Xu, Qimei Cui, Xiaofeng Tao, Tony Q. S. Quek	2025-05-29	Download	The upcoming 6G will fundamentally reshape mobile networks beyond communications, unlocking a multitude of applications that were once considered unimaginable.

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Knowledge Distillation for Reservoir-based Classifier: Human Activity Recognition	Masaharu Kagiyama, Tsuyoshi Okita	2025-05-29	Download	This paper aims to develop an energy-efficient classifier for time-series data by introducing PatchEchoClassifier, a novel model that leverages a reservoir-based mechanism known as the Echo State Netw...