2025-05-29

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Cache Your Prompt When It's Green: Carbon-Aware Caching for Large Language Model Serving | Yuyang Tian, Desen Sun, Yi Ding, Sihang Liu | 2025-05-29 | Download | As large language models (LLMs) become widely used, their environmental impact, especially carbon emission, has attracted more attention. Prior studies focus on compute-related carbon emissions. |
| A Unified Framework for Mapping and Synthesis of Approximate R-Blocks CGRAs | Georgios Alexandris, Panagiotis Chaidos, Alexis Maras, Barry de Bruin, Manil Dev Gomony, Henk Corporaal, Dimitrios Soudris, Sotirios Xydis | 2025-05-29 | Download | The ever-increasing complexity and operational diversity of modern Neural Networks (NNs) have caused the need for low-power and, at the same time, high-performance edge devices for AI applications. |
| A Novel Cost-Effective MIMO Architecture with Ray Antenna Array for Enhanced Wireless Communication Performance | Zhenjun Dong, Zhiwen Zhou, Yong Zeng | 2025-05-29 | Download | This paper proposes a novel multi-antenna architecture, termed ray antenna array (RAA), which practically enables flexible beamforming and also enhances wireless communication performance for high fre... |
| Energy-Efficient QoS-Aware Scheduling for S-NUCA Many-Cores | Sudam M. Wasala, Jurre Wolff, Yixian Shen, Anuj Pathania, Clemens Grelck, Andy D. Pimentel | 2025-05-29 | Download | Optimizing performance and energy efficiency in many-core processors, especially within Non-Uniform Cache Access (NUCA) architectures, remains a critical challenge. |
| Towards LLM-based Generation of Human-Readable Proofs in Polynomial Formal Verification | Rolf Drechsler | 2025-05-29 | Download | Verification is one of the central tasks in circuit and system design. While simulation and emulation are widely used, complete correctness can only be ensured based on formal proof techniques. |
| DX100: A Programmable Data Access Accelerator for Indirection | Alireza Khadem, Kamalavasan Kamalakkannan, Zhenyan Zhu, Akash Poptani, Yufeng Gu, Jered Benjamin Dominguez-Trujillo, Nishil Talati, Daichi Fujiki, Scott Mahlke, Galen Shipman, Reetuparna Das | 2025-05-29 | Download | Indirect memory accesses frequently appear in applications where memory bandwidth is a critical bottleneck. Prior indirect memory access proposals, such as indirect prefetchers, runahead execution, fe... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Cache Your Prompt When It's Green: Carbon-Aware Caching for Large Language Model Serving | Yuyang Tian, Desen Sun, Yi Ding, Sihang Liu | 2025-05-29 | Download | As large language models (LLMs) become widely used, their environmental impact, especially carbon emission, has attracted more attention. Prior studies focus on compute-related carbon emissions. |
| From Connectivity to Autonomy: The Dawn of Self-Evolving Communication Systems | Zeinab Nezami, Syed Danial Ali Shah, Maryam Hafeez, Karim Djemame, Syed Ali Raza Zaidi | 2025-05-29 | Download | This paper envisions 6G as a self-evolving telecom ecosystem, where AI-driven intelligence enables dynamic adaptation beyond static connectivity. |
| Distributed Federated Learning for Vehicular Network Security: Anomaly Detection Benefits and Multi-Domain Attack Threats | Utku Demir, Yalin E. Sagduyu, Tugba Erpek, Hossein Jafari, Sastry Kompella, Mengran Xue | 2025-05-29 | Download | In connected and autonomous vehicles, machine learning for safety message classification has become critical for detecting malicious or anomalous behavior. |
| Complementary Time-Space Tradeoff for Self-Stabilizing Leader Election: Polynomial States Meet Sublinear Time | Yuichi Sudo | 2025-05-29 | Download | We study the self-stabilizing leader election (SS-LE) problem in the population protocol model, assuming exact knowledge of the population size n. |
| Accelerated Training of Federated Learning via Second-Order Methods | Mrinmay Sen, Sidhant R Nair, C Krishna Mohan | 2025-05-29 | Download | This paper explores second-order optimization methods in Federated Learning (FL), addressing the critical challenges of slow convergence and the excessive communication rounds required to achieve opti... |
| Sustainable Carbon-Aware and Water-Efficient LLM Scheduling in Geo-Distributed Cloud Datacenters | Hayden Moore, Sirui Qi, Ninad Hogade, Dejan Milojicic, Cullen Bash, Sudeep Pasricha | 2025-05-29 | Download | In recent years, Large Language Models (LLM) such as ChatGPT, CoPilot, and Gemini have been widely adopted in different areas. As the use of LLMs continues to grow, many efforts have focused on reduci... |
| Efficient AllReduce with Stragglers | Arjun Devraj, Eric Ding, Abhishek Vijaya Kumar, Robert Kleinberg, Rachee Singh | 2025-05-29 | Download | Distributed machine learning workloads use data and tensor parallelism for training and inference, both of which rely on the AllReduce collective to synchronize gradients or activations. |
| D-Rex: Heterogeneity-Aware Reliability Framework and Adaptive Algorithms for Distributed Storage | Maxime Gonthier, Dante D. Sanchez-Gallegos, Haochen Pan, Bogdan Nicolae, Sicheng Zhou, Hai Duc Nguyen, Valerie Hayot-Sasson, J. Gregory Pauloski, Jesus Carretero, Kyle Chard, Ian Foster | 2025-05-29 | Download | The exponential growth of data necessitates distributed storage models, such as peer-to-peer systems and data federations. While distributed storage can reduce costs and increase reliability, the hete... |
| Evaluating the Efficacy of LLM-Based Reasoning for Multiobjective HPC Job Scheduling | Prachi Jadhav, Hongwei Jin, Ewa Deelman, Prasanna Balaprakash | 2025-05-29 | Download | High-Performance Computing (HPC) job scheduling involves balancing conflicting objectives such as minimizing makespan, reducing wait times, optimizing resource use, and ensuring fairness. |
| NestedFP: High-Performance, Memory-Efficient Dual-Precision Floating Point Support for LLMs | Haeun Lee, Omin Kwon, Yeonhong Park, Jae W. Lee | 2025-05-29 | Download | Meeting service-level objectives (SLOs) in Large Language Models (LLMs) serving is critical, but managing the high variability in load presents a significant challenge. |
| SealOS+: A Sealos-based Approach for Adaptive Resource Optimization Under Dynamic Workloads for Securities Trading System | Haojie Jia, Zhenhao Li, Gen Li, Minxian Xu, Kejiang Ye | 2025-05-29 | Download | As securities trading systems transition to a microservices architecture, optimizing system performance presents challenges such as inefficient resource scheduling and high service response delays. |
| MemAscend: System Memory Optimization for SSD-Offloaded LLM Fine-Tuning | Yong-Cheng Liaw, Shuo-Han Chen | 2025-05-29 | Download | Owing to the huge success of generative artificial intelligence (AI), large language models (LLMs) have emerged as a core subclass, underpinning applications such as question answering, text generatio... |
| Ghidorah: Fast LLM Inference on Edge with Speculative Decoding and Hetero-Core Parallelism | Jinhui Wei, Ye Huang, Yuhui Zhou, Jiazhi Jiang, Jiangsu Du, Yutong Lu | 2025-05-29 | Download | In-situ LLM inference on end-user devices has gained significant interest due to its privacy benefits and reduced dependency on external infrastructure. |
| The Panaceas for Improving Low-Rank Decomposition in Communication-Efficient Federated Learning | Shiwei Li, Xiandi Luo, Haozhao Wang, Xing Tang, Shijie Xu, Weihong Luo, Yuhua Li, Xiuqiang He, Ruixuan Li | 2025-05-29 | Download | To improve the training efficiency of federated learning (FL), previous research has employed low-rank decomposition techniques to reduce communication overhead. |
| DOPPLER: Dual-Policy Learning for Device Assignment in Asynchronous Dataflow Graphs | Xinyu Yao, Daniel Bourgeois, Abhinav Jain, Yuxin Tang, Jiawen Yao, Zhimin Ding, Arlei Silva, Chris Jermaine | 2025-05-29 | Download | We study the problem of assigning operations in a dataflow graph to devices to minimize execution time in a work-conserving system, with emphasis on complex machine learning workloads. |
| Speeding up Model Loading with fastsafetensors | Takeshi Yoshimura, Tatsuhiro Chiba, Manish Sethi, Daniel Waddington, Swaminathan Sundararaman | 2025-05-29 | Download | The rapid increase in model parameter sizes introduces new challenges in pre-trained model loading. Currently, machine learning code often deserializes each parameter as a tensor object in host memor... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| NASP: Network Slice as a Service Platform for 5G Networks | Felipe Hauschild Grings, Gustavo Zanatta Bruno, Lucio Rene Prade, Cristiano Bonato Both, José Marcos Camara Brito | 2025-05-29 | Download | With 5G's rapid global uptake, demand for agile private networks has exploded. A defining beyond-5G capability is network slicing. 3GPP specifies three core slice categories, massive Machine-Type Comm... |
| A tertiary review on quantum cryptography | Luiz Filipi Anderson de Sousa Moura, Carlos Becker Westphall | 2025-05-29 | Download | Quantum computers pose an immense threat to system security. As a countermeasure, new cryptographic classes have been created to prevent these attacks. |
| Searching Neural Architectures for Sensor Nodes on IoT Gateways | Andrea Mattia Garavagno, Edoardo Ragusa, Antonio Frisoli, Paolo Gastaldo | 2025-05-29 | Download | This paper presents an automatic method for the design of Neural Networks (NNs) at the edge, enabling Machine Learning (ML) access even in privacy-sensitive Internet of Things (IoT) applications. |
| Distributed Federated Learning for Vehicular Network Security: Anomaly Detection Benefits and Multi-Domain Attack Threats | Utku Demir, Yalin E. Sagduyu, Tugba Erpek, Hossein Jafari, Sastry Kompella, Mengran Xue | 2025-05-29 | Download | In connected and autonomous vehicles, machine learning for safety message classification has become critical for detecting malicious or anomalous behavior. |
| Towards A Global Quantum Internet: A Review of Challenges Facing Aerial Quantum Networks | Nitin Jha, Abhishek Parakh | 2025-05-29 | Download | Quantum networks use principles of quantum physics to create secure communication networks. Moving these networks off the ground using drones, balloons, or satellites could help increase the scalabili... |
| Quantum Hilbert Transform | Nitin Jha, Abhishek Parakh | 2025-05-29 | Download | The Hilbert transform has been one of the foundational transforms in signal processing, finding its way into multiple disciplines from cryptography to biomedical sciences. |
| Wireless Agentic AI with Retrieval-Augmented Multimodal Semantic Perception | Guangyuan Liu, Yinqiu Liu, Ruichen Zhang, Hongyang Du, Dusit Niyato, Zehui Xiong, Sumei Sun, Abbas Jamalipour | 2025-05-29 | Download | The rapid development of multimodal AI and Large Language Models (LLMs) has greatly enhanced real-time interaction, decision-making, and collaborative tasks. |
| Context-Aware Semantic Communication for the Wireless Networks | Guangyuan Liu, Yinqiu Liu, Jiacheng Wang, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Abbas Jamalipour | 2025-05-29 | Download | In next-generation wireless networks, supporting real-time applications such as augmented reality, autonomous driving, and immersive Metaverse services demands stringent constraints on bandwidth, late... |
| LUMION: Fast Fault Recovery for ML Jobs Using Programmable Optical Fabrics | Abhishek Vijaya Kumar, Eric Ding, Arjun Devraj, Darius Bunandar, Rachee Singh | 2025-05-29 | Download | When accelerators fail in modern ML datacenters, operators migrate the affected ML training or inference jobs to entirely new racks. This approach, while preserving network performance, is highly inef... |
| Agile Orchestration at Will: An Entire Smart Service-Based Security Architecture Towards 6G | Zhuoran Duan, Guoshun Nan, Rushan Li, Zijun Wang, Lihua Xiong, Chaoying Yuan, Guorong Liu, Hui Xu, Qimei Cui, Xiaofeng Tao, Tony Q. S. Quek | 2025-05-29 | Download | The upcoming 6G will fundamentally reshape mobile networks beyond communications, unlocking a multitude of applications that were once considered unimaginable. |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Knowledge Distillation for Reservoir-based Classifier: Human Activity Recognition | Masaharu Kagiyama, Tsuyoshi Okita | 2025-05-29 | Download | This paper aims to develop an energy-efficient classifier for time-series data by introducing PatchEchoClassifier, a novel model that leverages a reservoir-based mechanism known as the Echo State Netw... |