2025-03-24

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
"Test, Build, Deploy" -- A CI/CD Framework for Open-Source Hardware Designs	Calvin Deutschbein, Aristotle Stassinopoulos	2025-03-24	Download	Addressing TedX, Amber Huffman made an impassioned case that "none of us is as smart as all of us" and that open-source hardware is the future.
Reimagining Memory Access for LLM Inference: Compression-Aware Memory Controller Design	Rui Xie, Asad Ul Haq, Linsen Ma, Yunhua Fang, Zirak Burzin Engineer, Liu Liu, Tong Zhang	2025-03-24	Download	The efficiency of Large Language Model (LLM) inference is often constrained by substantial memory bandwidth and capacity demands. Existing techniques, such as pruning, quantization, and mixture of exp...
Efficient Trace for RISC-V: Design, Evaluation, and Integration in CVA6	Umberto Laghi, Simone Manoni, Emanuele Parisi, Andrea Bartolini	2025-03-24	Download	In this work, we present the design and evaluation of a Processor Tracing System compliant with the RISC-V Efficient Trace specification for Instruction Branch Tracing.
BitDecoding: Unlocking Tensor Cores for Long-Context LLMs with Low-Bit KV Cache	Dayou Du, Shijie Cao, Jianyi Cheng, Luo Mai, Ting Cao, Mao Yang	2025-03-24	Download	The growth of long-context Large Language Models (LLMs) significantly increases memory and bandwidth pressure during autoregressive decoding due to the expanding Key-Value (KV) cache.
Oaken: Fast and Efficient LLM Serving with Online-Offline Hybrid KV Cache Quantization	Minsu Kim, Seongmin Hong, RyeoWook Ko, Soongyu Choi, Hunjong Lee, Junsoo Kim, Joo-Young Kim, Jongse Park	2025-03-24	Download	Modern Large Language Model serving system batches multiple requests to achieve high throughput, while batching attention operations is challenging, rendering memory bandwidth a critical bottleneck.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
COoL-TEE: Client-TEE Collaboration for Resilient Distributed Search	Matthieu Bettinger, Etienne Rivière, Sonia Ben Mokhtar, Anthony Simonet-Boulogne	2025-03-24	Download	Current marketplaces rely on search mechanisms with distributed systems but centralized governance, making them vulnerable to attacks, failures, censorship and biases.
Reliability is Blind: Collective Incentives for Decentralized Computing Marketplaces without Individual Behavior Information	Henry Mont, Matthieu Bettinger, Sonia Ben Mokhtar, Anthony Simonet-Boulogne	2025-03-24	Download	In decentralized cloud computing marketplaces, ensuring fair and efficient interactions among asset providers and end-users is crucial. A key concern is meeting agreed-upon service-level objectives li...
Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization	Zhanda Zhu, Christina Giannoula, Muralidhar Andoorveedu, Qidong Su, Karttikeya Mangalam, Bojian Zheng, Gennady Pekhimenko	2025-03-24	Download	Various parallelism, such as data, tensor, and pipeline parallelism, along with memory optimizations like activation checkpointing, redundancy elimination, and offloading, have been proposed to accele...
Efficient Distributed Algorithms for Shape Reduction via Reconfigurable Circuits	Nada Almalki, Siddharth Gupta, Othon Michail, Andreas Padalkin	2025-03-24	Download	Autonomous reconfiguration of agent-based systems is a key challenge in the study of programmable matter, distributed robotics, and molecular self-assembly.
cfdSCOPE: A Fluid-Dynamics Proxy App for Teaching Performance Engineering	Peter Arzt, Sebastian Kreutzer, Tim Jammer, Christian Bischof	2025-03-24	Download	Teaching performance engineering in high-performance computing (HPC) requires example codes that demonstrate bottlenecks and enable hands-on optimization.
Monte Cimone v2: Down the Road of RISC-V High-Performance Computers	Emanuele Venieri, Simone Manoni, Gabriele Ceccolini, Giacomo Madella, Federico Ficarelli, Daniele Gregori, Daniele Cesarini, Luca Benini, Andrea Bartolini	2025-03-24	Download	Many RISC-V (RV) platforms and SoCs have been announced in recent years targeting the HPC sector, but only a few of them are commercially available and engineered to fit the HPC requirements.
Õptimal Fault-Tolerant Labeling for Reachability and Approximate Distances in Directed Planar Graphs	Itai Boneh, Shiri Chechik, Shay Golan, Shay Mozes, Oren Weimann	2025-03-24	Download	We present a labeling scheme that assigns labels of size $\tilde O(1)$ to the vertices of a directed weighted planar graph $G$, such that for any fixed $\varepsilon>0$ from the labels of any three ver...
AES-SpMM: Balancing Accuracy and Speed by Adaptive Edge Sampling Strategy to Accelerate SpMM in GNNs	Yingchen Song, Yaobin Wang, Yi Luo, Huan Wu, Pingping Tang	2025-03-24	Download	Coordinating the design of sampling and sparse-dense matrix multiplication (SpMM) is crucial for accelerating graph neural networks (GNNs). However, due to irrational sampling strategies, existing met...
ED-DAO: Energy Donation Algorithms based on Decentralized Autonomous Organization	Abdulrezzak Zekiye, Ouns Bouachir, Öznur Özkasap, Moayad Aloqaily	2025-03-24	Download	Energy is a fundamental component of modern life, driving nearly all aspects of daily activities. As such, the inability to access energy when needed is a significant issue that requires innovative so...
Jenga: Effective Memory Management for Serving LLM with Heterogeneity	Chen Zhang, Kuntai Du, Shu Liu, Woosuk Kwon, Xiangxi Mo, Yufeng Wang, Xiaoxuan Liu, Kaichao You, Zhuohan Li, Mingsheng Long, Jidong Zhai, Joseph Gonzalez, Ion Stoica	2025-03-24	Download	Large language models (LLMs) are widely used but expensive to run, especially as inference workloads grow. To lower costs, maximizing the request batch size by managing GPU memory efficiently is cruci...
Risk Management for Distributed Arbitrage Systems: Integrating Artificial Intelligence	Akaash Vishal Hazarika, Mahak Shah, Swapnil Patil, Pradyumna Shukla	2025-03-24	Download	Effective risk management solutions become absolutely crucial when financial markets embrace distributed technology and decentralized financing (DeFi).
Bridging Emotions and Architecture: Sentiment Analysis in Modern Distributed Systems	Mahak Shah, Akaash Vishal Hazarika, Meetu Malhotra, Sachin C. Patil, Joshit Mohanty	2025-03-24	Download	Sentiment analysis is a field within NLP that has gained importance because it is applied in various areas such as; social media surveillance, customer feedback evaluation and market research.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Rank-Based Modeling for Universal Packets Compression in Multi-Modal Communications	Xuanhao Luo, Zhiyuan Peng, Zhouyu Li, Ruozhou Yu, Yuchen Liu	2025-03-24	Download	The rapid increase in networked systems and data transmission requires advanced data compression solutions to optimize bandwidth utilization and enhance network performance.
Enhancing V2X Communications with UAV-mounted Reconfigurable Intelligent Surfaces	Salim Janji, Paweł Sroka, Adrian Kliks	2025-03-24	Download	This paper addresses the crucial need for reliable wireless communication in vehicular networks, particularly vital for the safety and efficacy of (semi-)autonomous driving amid increasing traffic.
Signal Propagation in RIS-Aided 5G Systems	Adam Samorzewski, Adrian Kliks	2025-03-24	Download	In this paper, we conduct an in-depth analysis of radio signal propagation characteristics within the urban environment of Poznan (Poland). The study specifically addresses the deployment of a 5th gen...
Energy-Efficient Dynamic Training and Inference for GNN-Based Network Modeling	Chetna Singhal, Yassine Hadjadj-Aoul	2025-03-24	Download	Efficient network modeling is essential for resource optimization and network planning in next-generation large-scale complex networks. Traditional approaches, such as queuing theory-based modeling an...
Periodic Chains Scheduling on Dedicated Resources -- A Crucial Problem in Time-Sensitive Networks	Josef Grus, Claire Hanen, Zdeněk Hanzálek	2025-03-24	Download	Periodic messages transfer data from sensors to actuators in cars, planes, and complex production machines. When considering a given routing, the unicast message starts at its source and goes over sev...
Real-Time Streaming Telemetry Based Detection and Mitigation of OOK and Power Interference in Multi-User OSaaS Networks	Agastya Raj, Devika Dass, Daniel C. Kilper, Marco Ruffini	2025-03-24	Download	We present a framework to identify and mitigate rogue OOK signals and user-generated power interference in a multi-user Optical-Spectrum-as-a-Service network.
Large Language Models powered Malicious Traffic Detection: Architecture, Opportunities and Case Study	Xinggong Zhang, Haotian Meng, Qingyang Li, Yunpeng Tan, Lei Zhang	2025-03-24	Download	Malicious traffic detection is a pivotal technology for network security to identify abnormal network traffic and detect network attacks. Large Language Models (LLMs) are trained on a vast corpus of t...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
BitDecoding: Unlocking Tensor Cores for Long-Context LLMs with Low-Bit KV Cache	Dayou Du, Shijie Cao, Jianyi Cheng, Luo Mai, Ting Cao, Mao Yang	2025-03-24	Download	The growth of long-context Large Language Models (LLMs) significantly increases memory and bandwidth pressure during autoregressive decoding due to the expanding Key-Value (KV) cache.
cfdSCOPE: A Fluid-Dynamics Proxy App for Teaching Performance Engineering	Peter Arzt, Sebastian Kreutzer, Tim Jammer, Christian Bischof	2025-03-24	Download	Teaching performance engineering in high-performance computing (HPC) requires example codes that demonstrate bottlenecks and enable hands-on optimization.