2026-04-11

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
A 129FPS Full HD Real-Time Accelerator for 3D Gaussian Splatting	Fang-Chi Chang, Tian-Sheuan Chang	2026-04-11	Download	Rendering large-scale, unbounded scenes on AR/VR-class devices is constrained by the computation, bandwidth, and storage cost of 3D Gaussian Splatting (3DGS).
WaveTune: Wave-aware Bilinear Modeling for Efficient GPU Kernel Auto-tuning	Kaixuan Zhang, Chutong Ding, Shiyou Qian, Luping Wang, Jian Cao, Guangtao Xue, Cheng Huang, Guodong Yang, Liping Zhang	2026-04-11	Download	The rapid adoption of Large Language Models (LLMs) has made GPU inference efficiency an increasingly critical system concern. The runtime of LLM workloads is largely dominated by tile-based kernels, p...
FlexVector: A SpMM Vector Processor with Flexible VRF for GCNs on Varying-Sparsity Graphs	Bohan Li, Shengmin Li, Xinyu Shi, Enyi Yao, Francky Catthoor, Simei Yang	2026-04-11	Download	Graph Convolutional Networks (GCNs) are widely adopted for tasks involving relational or graph-structured data and can be formulated as two-stage sparse-dense matrix multiplication (SpMM) during infer...
Late Breaking Results: CHESSY: Coupled Hybrid Emulation with SystemC-FPGA Synchronization	Lorenzo Ruotolo, Giovanni Pollo, Mohamed Amine Hamdi, Matteo Risso, Yukai Chen, Enrico Macii, Massimo Poncino, Sara Vinco, Alessio Burrello, Daniele Jahier Pagliari	2026-04-11	Download	The growing complexity of cyber-physical systems (CPSs) calls for early prototyping tools that combine accuracy, speed, and usability. Virtual Platforms (VPs) provide fast functional simulation, but h...
Aging Aware Adaptive Voltage Scaling for Reliable and Efficient AI Accelerators	Tong Xie, Zuodong Zhang, Chao Yang, Yuan Wang, Runsheng Wang, Meng Li	2026-04-11	Download	Deep neural networks (DNNs) have showcased remarkable performance across various tasks and are widely deployed on AI accelerators fabricated in advanced technology nodes for efficiency.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Icicle: Scalable Metadata Indexing and Real-Time Monitoring for HPC File Systems	Haochen Pan, Ryan Chard, Song Young Oh, Maxime Gonthier, Valérie Hayot-Sasson, Geoffrey Lentner, Joe Bottigliero, Rachana Ananthakrishnan, Kyle Chard, Ian Foster	2026-04-11	Download	Modern HPC file systems can contain billions of files and hundreds of petabytes of data, making even simple questions increasingly intractable to answer.
Verifying In-Network Computing Systems for Design Risks	Tianyu Bai, Ying Zhang, Ying Zhang, Wenfei Wu	2026-04-11	Download	The emergence of programmable switches has brought in-network computing (INC) into the spotlight in recent years. By offloading computation directly onto the data transmission process, INC improves ne...
RF-LEGO: Modularized Signal Processing-Deep Learning Co-Design for RF Sensing via Deep Unrolling	Luca Jiang-Tao Yu, Chenshu Wu	2026-04-11	Download	Wireless sensing, traditionally relying on signal processing (SP) techniques, has recently shifted toward data-driven deep learning (DL) to achieve performance breakthroughs.
Tessera: Unlocking Heterogeneous GPUs through Kernel-Granularity Disaggregation	Tiancheng Hu, Jin Qin, Zheng Wang, Junhao Hu, Yuzheng Wang, Lei Chen, Yizhou Shan, Mingxing Zhang, Ting Cao, Chunwei Xia, Huimin Cui, Tao Xie, Chenxi Wang	2026-04-11	Download	Disaggregation maps parts of an AI workload to different types of GPUs, offering a path to utilize modern heterogeneous GPU clusters. However, existing solutions operate at a coarse granularity and ar...
FlexVector: A SpMM Vector Processor with Flexible VRF for GCNs on Varying-Sparsity Graphs	Bohan Li, Shengmin Li, Xinyu Shi, Enyi Yao, Francky Catthoor, Simei Yang	2026-04-11	Download	Graph Convolutional Networks (GCNs) are widely adopted for tasks involving relational or graph-structured data and can be formulated as two-stage sparse-dense matrix multiplication (SpMM) during infer...
LoDAdaC: a unified local training-based decentralized framework with adaptive gradients and compressed communication	Wei Liu, Anweshit Panda, Ujwal Pandey, Haven Cook, George M. Slota, Naigang Wang, Jie Chen, Yangyang Xu	2026-04-11	Download	In the decentralized distributed learning, achieving fast convergence and low communication cost is essential for scalability and high efficiency.
Rebooting Microreboot: Architectural Support for Safe, Parallel Recovery in Microservice Systems	Laurent Bindschaedler	2026-04-11	Download	Microreboot enables fast recovery by restarting only the failing component, but in modern microservices naive restarts are unsafe: dense dependencies mean rebooting one service can disrupt many caller...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Impact of Intelligent Technologies on IoV Security: Integrating Edge Computing and AI	Awais Bilal, Kashif Sharif, Liehuang Zhu, Chang Xu, Fan Li, Sadaf Bukhari, Sujit Biswas	2026-04-11	Download	The rapid development and integration of intelligent technologies in the Internet of Vehicles (IoV) have revolutionized transportation systems by enhancing connectivity, automation, and safety.

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
ClawVM: Harness-Managed Virtual Memory for Stateful Tool-Using LLM Agents	Mofasshara Rafique, Laurent Bindschaedler	2026-04-11	Download	Stateful tool-using LLM agents treat the context window as working memory, yet today's agent harnesses manage residency and durability as best-effort, causing recurring failures: lost state after comp...