2025-03-24
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| "Test, Build, Deploy" -- A CI/CD Framework for Open-Source Hardware Designs | Calvin Deutschbein, Aristotle Stassinopoulos | 2025-03-24 | Download | Addressing TedX, Amber Huffman made an impassioned case that "none of us is as smart as all of us" and that open-source hardware is the future. |
| Reimagining Memory Access for LLM Inference: Compression-Aware Memory Controller Design | Rui Xie, Asad Ul Haq, Linsen Ma, Yunhua Fang, Zirak Burzin Engineer, Liu Liu, Tong Zhang | 2025-03-24 | Download | The efficiency of Large Language Model (LLM) inference is often constrained by substantial memory bandwidth and capacity demands. Existing techniques, such as pruning, quantization, and mixture of exp... |
| Efficient Trace for RISC-V: Design, Evaluation, and Integration in CVA6 | Umberto Laghi, Simone Manoni, Emanuele Parisi, Andrea Bartolini | 2025-03-24 | Download | In this work, we present the design and evaluation of a Processor Tracing System compliant with the RISC-V Efficient Trace specification for Instruction Branch Tracing. |
| BitDecoding: Unlocking Tensor Cores for Long-Context LLMs with Low-Bit KV Cache | Dayou Du, Shijie Cao, Jianyi Cheng, Luo Mai, Ting Cao, Mao Yang | 2025-03-24 | Download | The growth of long-context Large Language Models (LLMs) significantly increases memory and bandwidth pressure during autoregressive decoding due to the expanding Key-Value (KV) cache. |
| Oaken: Fast and Efficient LLM Serving with Online-Offline Hybrid KV Cache Quantization | Minsu Kim, Seongmin Hong, RyeoWook Ko, Soongyu Choi, Hunjong Lee, Junsoo Kim, Joo-Young Kim, Jongse Park | 2025-03-24 | Download | Modern Large Language Model serving system batches multiple requests to achieve high throughput, while batching attention operations is challenging, rendering memory bandwidth a critical bottleneck. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| COoL-TEE: Client-TEE Collaboration for Resilient Distributed Search | Matthieu Bettinger, Etienne Rivière, Sonia Ben Mokhtar, Anthony Simonet-Boulogne | 2025-03-24 | Download | Current marketplaces rely on search mechanisms with distributed systems but centralized governance, making them vulnerable to attacks, failures, censorship and biases. |
| Reliability is Blind: Collective Incentives for Decentralized Computing Marketplaces without Individual Behavior Information | Henry Mont, Matthieu Bettinger, Sonia Ben Mokhtar, Anthony Simonet-Boulogne | 2025-03-24 | Download | In decentralized cloud computing marketplaces, ensuring fair and efficient interactions among asset providers and end-users is crucial. A key concern is meeting agreed-upon service-level objectives li... |
| Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization | Zhanda Zhu, Christina Giannoula, Muralidhar Andoorveedu, Qidong Su, Karttikeya Mangalam, Bojian Zheng, Gennady Pekhimenko | 2025-03-24 | Download | Various parallelism, such as data, tensor, and pipeline parallelism, along with memory optimizations like activation checkpointing, redundancy elimination, and offloading, have been proposed to accele... |
| Efficient Distributed Algorithms for Shape Reduction via Reconfigurable Circuits | Nada Almalki, Siddharth Gupta, Othon Michail, Andreas Padalkin | 2025-03-24 | Download | Autonomous reconfiguration of agent-based systems is a key challenge in the study of programmable matter, distributed robotics, and molecular self-assembly. |
| cfdSCOPE: A Fluid-Dynamics Proxy App for Teaching Performance Engineering | Peter Arzt, Sebastian Kreutzer, Tim Jammer, Christian Bischof | 2025-03-24 | Download | Teaching performance engineering in high-performance computing (HPC) requires example codes that demonstrate bottlenecks and enable hands-on optimization. |
| Monte Cimone v2: Down the Road of RISC-V High-Performance Computers | Emanuele Venieri, Simone Manoni, Gabriele Ceccolini, Giacomo Madella, Federico Ficarelli, Daniele Gregori, Daniele Cesarini, Luca Benini, Andrea Bartolini | 2025-03-24 | Download | Many RISC-V (RV) platforms and SoCs have been announced in recent years targeting the HPC sector, but only a few of them are commercially available and engineered to fit the HPC requirements. |
| Õptimal Fault-Tolerant Labeling for Reachability and Approximate Distances in Directed Planar Graphs | Itai Boneh, Shiri Chechik, Shay Golan, Shay Mozes, Oren Weimann | 2025-03-24 | Download | We present a labeling scheme that assigns labels of size to the vertices of a directed weighted planar graph , such that for any fixed from the labels of any three ver... |
| AES-SpMM: Balancing Accuracy and Speed by Adaptive Edge Sampling Strategy to Accelerate SpMM in GNNs | Yingchen Song, Yaobin Wang, Yi Luo, Huan Wu, Pingping Tang | 2025-03-24 | Download | Coordinating the design of sampling and sparse-dense matrix multiplication (SpMM) is crucial for accelerating graph neural networks (GNNs). However, due to irrational sampling strategies, existing met... |
| ED-DAO: Energy Donation Algorithms based on Decentralized Autonomous Organization | Abdulrezzak Zekiye, Ouns Bouachir, Öznur Özkasap, Moayad Aloqaily | 2025-03-24 | Download | Energy is a fundamental component of modern life, driving nearly all aspects of daily activities. As such, the inability to access energy when needed is a significant issue that requires innovative so... |
| Jenga: Effective Memory Management for Serving LLM with Heterogeneity | Chen Zhang, Kuntai Du, Shu Liu, Woosuk Kwon, Xiangxi Mo, Yufeng Wang, Xiaoxuan Liu, Kaichao You, Zhuohan Li, Mingsheng Long, Jidong Zhai, Joseph Gonzalez, Ion Stoica | 2025-03-24 | Download | Large language models (LLMs) are widely used but expensive to run, especially as inference workloads grow. To lower costs, maximizing the request batch size by managing GPU memory efficiently is cruci... |
| Risk Management for Distributed Arbitrage Systems: Integrating Artificial Intelligence | Akaash Vishal Hazarika, Mahak Shah, Swapnil Patil, Pradyumna Shukla | 2025-03-24 | Download | Effective risk management solutions become absolutely crucial when financial markets embrace distributed technology and decentralized financing (DeFi). |
| Bridging Emotions and Architecture: Sentiment Analysis in Modern Distributed Systems | Mahak Shah, Akaash Vishal Hazarika, Meetu Malhotra, Sachin C. Patil, Joshit Mohanty | 2025-03-24 | Download | Sentiment analysis is a field within NLP that has gained importance because it is applied in various areas such as social media surveillance, customer feedback evaluation, and market research. |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Rank-Based Modeling for Universal Packets Compression in Multi-Modal Communications | Xuanhao Luo, Zhiyuan Peng, Zhouyu Li, Ruozhou Yu, Yuchen Liu | 2025-03-24 | Download | The rapid increase in networked systems and data transmission requires advanced data compression solutions to optimize bandwidth utilization and enhance network performance. |
| Enhancing V2X Communications with UAV-mounted Reconfigurable Intelligent Surfaces | Salim Janji, Paweł Sroka, Adrian Kliks | 2025-03-24 | Download | This paper addresses the crucial need for reliable wireless communication in vehicular networks, particularly vital for the safety and efficacy of (semi-)autonomous driving amid increasing traffic. |
| Signal Propagation in RIS-Aided 5G Systems | Adam Samorzewski, Adrian Kliks | 2025-03-24 | Download | In this paper, we conduct an in-depth analysis of radio signal propagation characteristics within the urban environment of Poznan (Poland). The study specifically addresses the deployment of a 5th gen... |
| Energy-Efficient Dynamic Training and Inference for GNN-Based Network Modeling | Chetna Singhal, Yassine Hadjadj-Aoul | 2025-03-24 | Download | Efficient network modeling is essential for resource optimization and network planning in next-generation large-scale complex networks. Traditional approaches, such as queuing theory-based modeling an... |
| Periodic Chains Scheduling on Dedicated Resources -- A Crucial Problem in Time-Sensitive Networks | Josef Grus, Claire Hanen, Zdeněk Hanzálek | 2025-03-24 | Download | Periodic messages transfer data from sensors to actuators in cars, planes, and complex production machines. When considering a given routing, the unicast message starts at its source and goes over sev... |
| Real-Time Streaming Telemetry Based Detection and Mitigation of OOK and Power Interference in Multi-User OSaaS Networks | Agastya Raj, Devika Dass, Daniel C. Kilper, Marco Ruffini | 2025-03-24 | Download | We present a framework to identify and mitigate rogue OOK signals and user-generated power interference in a multi-user Optical-Spectrum-as-a-Service network. |
| Large Language Models powered Malicious Traffic Detection: Architecture, Opportunities and Case Study | Xinggong Zhang, Haotian Meng, Qingyang Li, Yunpeng Tan, Lei Zhang | 2025-03-24 | Download | Malicious traffic detection is a pivotal technology for network security to identify abnormal network traffic and detect network attacks. Large Language Models (LLMs) are trained on a vast corpus of t... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| BitDecoding: Unlocking Tensor Cores for Long-Context LLMs with Low-Bit KV Cache | Dayou Du, Shijie Cao, Jianyi Cheng, Luo Mai, Ting Cao, Mao Yang | 2025-03-24 | Download | The growth of long-context Large Language Models (LLMs) significantly increases memory and bandwidth pressure during autoregressive decoding due to the expanding Key-Value (KV) cache. |
| cfdSCOPE: A Fluid-Dynamics Proxy App for Teaching Performance Engineering | Peter Arzt, Sebastian Kreutzer, Tim Jammer, Christian Bischof | 2025-03-24 | Download | Teaching performance engineering in high-performance computing (HPC) requires example codes that demonstrate bottlenecks and enable hands-on optimization. |