2026-03-18
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| PAI: Fast, Accurate, and Full Benchmark Performance Projection with AI | Avery Johnson, Mohammad Majharul Islam, Riad Akram, Abdullah Muzahid | 2026-03-18 | Download | The exponential increase in complex IPs within modern SoCs, driven by Moore's Law, has created a pressing need for fast and accurate hardware-software power-performance analysis. |
| A Survey of Neural Network Variational Monte Carlo from a Computing Workload Characterization Perspective | Zhengze Xiao, Xuanzhe Ding, Yuyang Lou, Lixue Cheng, Chaojian Li | 2026-03-18 | Download | Neural Network Variational Monte Carlo (NNVMC) has emerged as a promising paradigm for solving quantum many-body problems by combining variational Monte Carlo with expressive neural-network wave-funct... |
| Enabling RISC-V Vector Code Generation in MLIR through Custom xDSL Lowerings | Jie Lei, Héctor Martínez, Adrián Castelló | 2026-03-18 | Download | The growing adoption of RISC-V in high-performance and scientific computing has increased the need for performance-portable code targeting the RISC-V Vector (RVV) extension. |
| HWE-Bench: Can Language Models Perform Board-level Schematic Designs? | Weibo Qiu, Yinhao Xiao, Runyu Pan | 2026-03-18 | Download | Large Language Models (LLMs) have demonstrated significant potential in various engineering tasks, including software development, digital logic generation, and companion document maintenance. |
| The Program Hypergraph: Multi-Way Relational Structure for Geometric Algebra, Spatial Compute, and Physics-Aware Compilation | Houston Haynes | 2026-03-18 | Download | The Program Semantic Graph (PSG) introduced in prior work on Dimensional Type Systems and Deterministic Memory Management encodes compilation-relevant properties as binary edge relations between compu... |
| ZipServ: Fast and Memory-Efficient LLM Inference with Hardware-Aware Lossless Compression | Ruibo Fan, Xiangrui Yu, Xinglin Pan, Zeyu Li, Weile Luo, Qiang Wang, Wei Wang, Xiaowen Chu | 2026-03-18 | Download | Lossless model compression holds tremendous promise for alleviating the memory and bandwidth bottlenecks in bit-exact Large Language Model (LLM) serving. |
| ReLMXEL: Adaptive RL-Based Memory Controller with Explainable Energy and Latency Optimization | Panuganti Chirag Sai, Gandholi Sarat, R. Raghunatha Sarma, Venkata Kalyan Tavva, Naveen M | 2026-03-18 | Download | Reducing latency and energy consumption is critical to improving the efficiency of memory systems in modern computing. This work introduces ReLMXEL (Reinforcement Learning for Memory Controller with E... |
| A Synthesizable RTL Implementation of Predictive Coding Networks | Timothy Oh | 2026-03-18 | Download | Backpropagation has enabled modern deep learning but is difficult to realize as an online, fully distributed hardware learning system due to global error propagation, phase separation, and heavy relia... |
| KANtize: Exploring Low-bit Quantization of Kolmogorov-Arnold Networks for Efficient Inference | Sohaib Errabii, Olivier Sentieys, Marcello Traiola | 2026-03-18 | Download | Kolmogorov-Arnold Networks (KANs) have gained attention for their potential to outperform Multi-Layer Perceptrons (MLPs) in terms of parameter efficiency and interpretability. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Adaptive Domain Models: Bayesian Evolution, Warm Rotation, and Principled Training for Geometric and Neuromorphic AI | Houston Haynes | 2026-03-18 | Download | Prevailing AI training infrastructure assumes reverse-mode automatic differentiation over IEEE-754 arithmetic. The memory overhead of training relative to inference, optimizer complexity, and structur... |
| A mechanism design overview of Sedna | Benjamin Marsh, Alejandro Ranchal-Pedrosa | 2026-03-18 | Download | Sedna is a coded multi-proposer consensus protocol in which a sender shards a transaction payload into rateless symbols and disseminates them across parallel proposer lanes, providing high throughput ... |
| Multi-stage Flow Scheduling for LLM Serving | Yijun Sun, Xudong Liao, Songrun Xie, Hao Chen, Han Tian, Wenxue Li, Yiming Zhang, Kai Chen | 2026-03-18 | Download | Meeting stringent Time-To-First-Token (TTFT) requirements is crucial for LLM applications. To improve efficiency, modern LLM serving systems adopt disaggregated architectures with diverse parallelisms... |
| ZipServ: Fast and Memory-Efficient LLM Inference with Hardware-Aware Lossless Compression | Ruibo Fan, Xiangrui Yu, Xinglin Pan, Zeyu Li, Weile Luo, Qiang Wang, Wei Wang, Xiaowen Chu | 2026-03-18 | Download | Lossless model compression holds tremendous promise for alleviating the memory and bandwidth bottlenecks in bit-exact Large Language Model (LLM) serving. |
| The 1/W Law: An Analytical Study of Context-Length Routing Topology and GPU Generation Gains for LLM Inference Energy Efficiency | Huamin Chen, Xunzhuo Liu, Yuhan Liu, Junchen Jiang, Bowei He, Xue Liu | 2026-03-18 | Download | How many tokens can a GPU inference cluster deliver per watt? Across deployments of identical hardware, the answer varies by 40x -- not because of software inefficiency, but because of the serving con... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Offload or Overload: A Platform Measurement Study of Mobile Robotic Manipulation Workloads | Sara Pohland, Xenofon Foukas, Ganesh Ananthanarayanan, Andrey Kolobov, Sanjeev Mehrotra, Bozidar Radunovic, Ankit Verma | 2026-03-18 | Download | Mobile robotic manipulation--the ability of robots to navigate spaces and interact with objects--is a core capability of physical AI. Foundation models have led to breakthroughs in their performance, ... |
| RIS-Aided Mobile Network Design | Adam Samorzewski, Adrian Kliks | 2026-03-18 | Download | In this paper, we examine the distribution of radio signal propagation within the city of Poznan (Poland) to determine optimal locations for deploying Reconfigurable Intelligent Surfaces (RIS). |
| Access Controlled Website Interaction for Agentic AI with Delegated Critical Tasks | Sunyoung Kim, Hokeun Kim | 2026-03-18 | Download | Recent studies reveal gaps in delegating critical tasks to agentic AI that accesses websites on the user's behalf, primarily due to limited access control mechanisms on websites designed for agentic A... |
| Enabling Real-Time Programmability for RAN Functions: A Wasm-Based Approach for Robust and High-Performance dApps | João Paulo Esper, Yure Freitas, Pedro Souza, Bruno Silvestre, Joao F. Santos, Alexandre Huff, Cristiano Both, Kleber Cardoso | 2026-03-18 | Download | While the Open Radio Access Network Alliance (O-RAN) architecture enables third-party applications to optimize radio access networks at multiple timescales, real-time distributed applications (dApps) ... |
| A Vision-based Framework for Intelligent gNodeB Mobility Control | Pedro Duarte, André Coelho, Francisco Ribeiro, Filipe B. Teixeira, Luís Pessoa, Manuel Ricardo | 2026-03-18 | Download | This paper proposes a vision-based framework for the intelligent control of mobile Open Radio Access Network (O-RAN) base stations (gNBs) operating in dynamic wireless environments. |
| Bringing Network Coding into Multi-Robot Systems: Interplay Study for Autonomous Systems over Wireless Communications | Anil Zaher, Kiril Solovey, Alejandro Cohen | 2026-03-18 | Download | Communication is a core enabler for multi-robot systems (MRS), providing the mechanism through which robots exchange state information, coordinate actions, and satisfy safety constraints. |
| Multi-stage Flow Scheduling for LLM Serving | Yijun Sun, Xudong Liao, Songrun Xie, Hao Chen, Han Tian, Wenxue Li, Yiming Zhang, Kai Chen | 2026-03-18 | Download | Meeting stringent Time-To-First-Token (TTFT) requirements is crucial for LLM applications. To improve efficiency, modern LLM serving systems adopt disaggregated architectures with diverse parallelisms... |
| IEMAS: An Incentive-Efficiency Routing Framework for Open Agentic Web Ecosystems | Hongze Liu, Chang Guo, Yingzeng Li, Mengru Wang, Jiong Lou, Shijing Yuan, Hefeng Zhou, Chentao Wu, Jie LI | 2026-03-18 | Download | The transition to open, distributed Multi-Agent Systems (MAS) promises scalable intelligence but introduces a non-trivial tension: maximizing global efficiency requires cooperative, resource-aware sch... |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| AppFlow: Memory Scheduling for Cold Launch of Large Apps on Mobile and Vehicle Systems | Xiaochen Li, Sicong Liu, Bin Guo, Yu Ouyang, Fengmin Wu, Yuan Xu, Zhiwen Yu | 2026-03-18 | Download | GB-scale large apps like on-device LLMs and rich media editors are becoming the next-generation trend, but their heavy memory and I/O demands, especially during multitasking, cause devices to reclaim ... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Swarm: Co-Activation Aware KVCache Offloading Across Multiple SSDs | Tuowei Wang, Liyun Chu, Ruwen Fan, Ju Ren | 2026-03-18 | Download | The key-value (KV) cache has become the dominant contributor to memory consumption in large language model (LLM) inference. Although offloading KVCache from GPU high-bandwidth memory (HBM) to CPU DRAM... |
| ZipServ: Fast and Memory-Efficient LLM Inference with Hardware-Aware Lossless Compression | Ruibo Fan, Xiangrui Yu, Xinglin Pan, Zeyu Li, Weile Luo, Qiang Wang, Wei Wang, Xiaowen Chu | 2026-03-18 | Download | Lossless model compression holds tremendous promise for alleviating the memory and bandwidth bottlenecks in bit-exact Large Language Model (LLM) serving. |