2025-12-18

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Full System Architecture Modeling for Wearable Egocentric Contextual AI	Vincent T. Lee, Tanfer Alan, Sung Kim, Ecenur Ustun, Amr Suleiman, Ajit Krisshna, Tim Balbekov, Armin Alaghi, Richard Newcombe	2025-12-18	Download	The next generation of human-oriented computing will require always-on, spatially-aware wearable devices to capture egocentric vision and functional primitives (e.g.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Taming the Memory Footprint Crisis: System Design for Production Diffusion LLM Serving	Jiakun Fan, Yanglin Zhang, Xiangchen Li, Dimitrios S. Nikolopoulos	2025-12-18	Download	Diffusion Large Language Models (dLLMs) have emerged as a promising alternative to Autoregressive Models (ARMs), utilizing parallel decoding to overcome sequential bottlenecks.
BitFlipScope: Scalable Fault Localization and Recovery for Bit-Flip Corruptions in LLMs	Muhammad Zeeshan Karamat, Sadman Saif, Christiana Chamon Garcia	2025-12-18	Download	Large Language Models (LLMs) deployed in practical and safety-critical settings are increasingly susceptible to bit-flip faults caused by hardware degradation, cosmic radiation, or deliberate fault-in...
Sedna: Sharding transactions in multiple concurrent proposer blockchains	Alejandro Ranchal-Pedrosa, Benjamin Marsh, Lefteris Kokoris-Kogias, Alberto Sonnino	2025-12-18	Download	Modern blockchains increasingly adopt multi-proposer (MCP) consensus to remove single-leader bottlenecks and improve censorship resistance. However, MCP alone does not resolve how users should dissemi...
LLM-HPC++: Evaluating LLM-Generated Modern C++ and MPI+OpenMP Codes for Scalable Mandelbrot Set Computation	Patrick Diehl, Noujoud Nader, Deepti Gupta	2025-12-18	Download	Parallel programming remains one of the most challenging aspects of High-Performance Computing (HPC), requiring deep knowledge of synchronization, communication, and memory models.
Training Together, Diagnosing Better: Federated Learning for Collagen VI-Related Dystrophies	Astrid Brull, Sara Aguti, Véronique Bolduc, Ying Hu, Daniel M. Jimenez-Gutierrez, Enrique Zuazua, Joaquin Del-Rio, Oleksii Sliusarenko, Haiyan Zhou, Francesco Muntoni, Carsten G. Bönnemann, Xabi Uribe-Etxebarria	2025-12-18	Download	The application of Machine Learning (ML) to the diagnosis of rare diseases, such as collagen VI-related dystrophies (COL6-RD), is fundamentally limited by the scarcity and fragmentation of available d...
Coordinated Anti-Jamming Resilience in Swarm Networks via Multi-Agent Reinforcement Learning	Bahman Abolhassani, Tugba Erpek, Kemal Davaslioglu, Yalin E. Sagduyu, Sastry Kompella	2025-12-18	Download	Reactive jammers pose a severe security threat to robotic-swarm networks by selectively disrupting inter-agent communications and undermining formation integrity and mission success.
Delay-Aware Multi-Stage Edge Server Upgrade with Budget Constraint	Endar Suprih Wihidayat, Sieteng Soh, Kwan-Wu Chin, Duc-son Pham	2025-12-18	Download	In this paper, the Multi-stage Edge Server Upgrade (M-ESU) is proposed as a new network planning problem, involving the upgrading of an existing multi-access edge computing (MEC) system through multip...
Efficient Bitcoin Meta-Protocol Transaction and Data Discovery Through nLockTime Field Repurposing	Nikodem Tomczak	2025-12-18	Download	We describe the Lockchain Protocol, a lightweight Bitcoin meta-protocol that enables highly efficient transaction discovery at zero marginal block space cost, and data verification without introducing...
Efficient CPU-GPU Collaborative Inference for MoE-based LLMs on Memory-Limited Systems	En-Ming Huang, Li-Shang Lin, Chun-Yi Lee	2025-12-18	Download	Large Language Models (LLMs) have achieved impressive results across various tasks, yet their high computational demands pose deployment challenges, especially on consumer-grade hardware.
AI4EOSC: a Federated Cloud Platform for Artificial Intelligence in Scientific Research	Ignacio Heredia, Álvaro López García, Fernando Aguilar Gómez, Diego Aguirre, Caterina Alarcón Marín, Khadijeh Alibabaei, Lisana Berberi, Miguel Caballer, Amanda Calatrava, Pedro Castro, Alessandro Costantini, Mario David, Jaime Díez, Stefan Dlugolinsky, Borja Esteban Sanchis, Giacinto Donvito, Leonhard Duda, Saúl Fernandez, Andrés Heredia Canales, Valentin Kozlov, Sergio Langarita, João Machado, Germán Moltó, Daniel San Martín, Martin Šeleng, Giang Nguyen, Marcin Płóciennik, Marta Obregón Ruiz, Susana Rebolledo Ruiz, Vicente Rodriguez, Judith Sáinz-Pardo Díaz, Viet Tran	2025-12-18	Download	In this paper, we describe a federated compute platform dedicated to support Artificial Intelligence in scientific workloads. Putting the effort into reproducible deployments, it delivers consistent, ...
Kascade: A Practical Sparse Attention Method for Long-Context LLM Inference	Dhruv Deshmukh, Saurabh Goyal, Nipun Kwatra, Ramachandran Ramjee	2025-12-18	Download	Attention is the dominant source of latency during long-context LLM inference, an increasingly popular workload with reasoning models and RAG.
AiiDAlab: on the route to accelerate science	Aliaksandr V. Yakutovich, Jusong Yu, Daniel Hollas, Edan Bainglass, Corsin Battaglia, Miki Bonacci, Lucas Fernandez Vilanova, Stephan Henne, Anders Kaestner, Michel Kenzelmann, Graham Kimbell, Jakob Lass, Fabio Lopes, Daniel G. Mazzone, Andres Ortega-Guerrero, Xing Wang, Nicola Marzari, Carlo A. Pignedoli, Giovanni Pizzi	2025-12-18	Download	With the availability of ever-increasing computational capabilities, robust and automated research workflows are essential to enable and facilitate the execution and orchestration of large numbers of ...
FlexKV: Flexible Index Offloading for Memory-Disaggregated Key-Value Store	Zhisheng Hu, Jiacheng Shen, Ming-Chang Yang	2025-12-18	Download	Disaggregated memory (DM) is a promising data center architecture that decouples CPU and memory into independent resource pools to improve resource utilization.
Analysis of Design Patterns and Benchmark Practices in Apache Kafka Event-Streaming Systems	Muzeeb Mohammad	2025-12-18	Download	Apache Kafka has become a foundational platform for high throughput event streaming, enabling real time analytics, financial transaction processing, industrial telemetry, and large scale data driven s...
Lotus: Optimizing Disaggregated Transactions with Disaggregated Locks	Zhisheng Hu, Pengfei Zuo, Junliang Hu, Yizou Chen, Yingjia Wang, Ming-Chang Yang	2025-12-18	Download	Disaggregated memory (DM) separates compute and memory resources, allowing flexible scaling to achieve high resource utilization. To ensure atomic and consistent data access on DM, distributed transac...
Staggered Batch Scheduling: Co-optimizing Time-to-First-Token and Throughput for High-Efficiency LLM Inference	Jian Tian, Shuailong Li, Yang Cao, Wenbo Cui, Minghan Zhu, Wenkang Wu, Jianming Zhang, Yanpeng Wang, Zhiwen Xiao, Zhenyu Hou, Dou Shen	2025-12-18	Download	The evolution of Large Language Model (LLM) serving towards complex, distributed architectures--specifically the P/D-separated, large-scale DP+EP paradigm--introduces distinct scheduling challenges.
An Online Fragmentation-Aware Scheduler for Managing GPU-Sharing Workloads on Multi-Instance GPUs	Hsu-Tzu Ting, Jerry Chou, Ming-Hung Chen, I-Hsin Chung	2025-12-18	Download	Modern GPU workloads increasingly demand efficient resource sharing, as many jobs do not require the full capacity of a GPU. Among sharing techniques, NVIDIA's Multi-Instance GPU (MIG) offers strong r...
Cold-Start Anti-Patterns and Refactorings in Serverless Systems: An Empirical Study	Syed Salauddin Mohammad Tariq, Foyzul Hassan, Amiangshu Bosu, Probir Roy	2025-12-18	Download	Serverless computing simplifies deployment and scaling, yet cold-start latency remains a major performance bottleneck. Unlike prior work that treats mitigation as a black-box optimization, we study co...
Twinning for Space-Air-Ground-Sea Integrated Networks: Beyond Conventional Digital Twin Towards Goal-Oriented Semantic Twin	Yifei Qiu, Tianle Liao, Xin Jin, Qinyu Zhang, Shaohua Wu	2025-12-18	Download	A space-air-ground-sea integrated network (SAGSIN) has emerged as a cornerstone of 6G systems, establishing a unified global architecture by integrating multi-domain network resources.
MultiPath Transfer Engine: Breaking GPU and Host-Memory Bandwidth Bottlenecks in LLM Services	Lingfeng Tang, Daoping Zhang, Junjie Chen, Peihao Huang, Feng Jin, Chengguang Xu, Yuxin Chen, Feiqiang Sun, Guo Chen	2025-12-18	Download	The limited bandwidth of PCIe has emerged as the critical bottleneck for large language model (LLM) performance, such as prefix cache fetching and model switching.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
How to Discover Knowledge for FutureG: Contextual RAG and LLM Prompting for O-RAN	Nathan Conger, Nathan Scollar, Kemal Davaslioglu, Yalin E. Sagduyu, Sastry Kompella	2025-12-18	Download	We present a retrieval-augmented question answering framework for 5G/6G networks, where the Open Radio Access Network (O-RAN) has become central to disaggregated, virtualized, and AI-driven wireless s...
Coordinated Anti-Jamming Resilience in Swarm Networks via Multi-Agent Reinforcement Learning	Bahman Abolhassani, Tugba Erpek, Kemal Davaslioglu, Yalin E. Sagduyu, Sastry Kompella	2025-12-18	Download	Reactive jammers pose a severe security threat to robotic-swarm networks by selectively disrupting inter-agent communications and undermining formation integrity and mission success.
Acoustic RIS for Massive Spatial Multiplexing: Unleashing Degrees of Freedom and Capacity in Underwater Communications	Longfei Zhao, Jingbo Tan, Jintao Wang, Ian F Akyildiz, Zhi Sun	2025-12-18	Download	Underwater acoustic (UWA) communications are essential for high-speed marine data transmission but remain severely constrained by limited bandwidth, significant propagation loss, and sparse multipath ...
A Network Arena for Benchmarking AI Agents on Network Troubleshooting	Zhihao Wang, Alessandro Cornacchia, Alessio Sacco, Franco Galante, Marco Canini, Dingde Jiang	2025-12-18	Download	Agentic systems, powered by Large Language Models (LLMs), assist network engineers with network configuration synthesis and network troubleshooting tasks.
Privacy-Aware Sharing of Raw Spatial Sensor Data for Cooperative Perception	Bangya Liu, Chengpo Yan, Chenghao Jiang, Suman Banerjee, Akarsh Prabhakara	2025-12-18	Download	Cooperative perception between vehicles is poised to offer robust and reliable scene understanding. Recently, we are witnessing experimental systems research building testbeds that share raw spatial s...
MultiPath Transfer Engine: Breaking GPU and Host-Memory Bandwidth Bottlenecks in LLM Services	Lingfeng Tang, Daoping Zhang, Junjie Chen, Peihao Huang, Feng Jin, Chengguang Xu, Yuxin Chen, Feiqiang Sun, Guo Chen	2025-12-18	Download	The limited bandwidth of PCIe has emerged as the critical bottleneck for large language model (LLM) performance, such as prefix cache fetching and model switching.

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
Trustworthy and Controllable Professional Knowledge Utilization in Large Language Models with TEE-GPU Execution	Yifeng Cai, Zhida An, Yuhan Meng, Houqian Liu, Pengli Wang, Hanwen Lei, Yao Guo, Ding Li	2025-12-18	Download	Future improvements in large language model (LLM) services increasingly hinge on access to high-value professional knowledge rather than more generic web data.

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
An Upper Bound on the M/M/k Queue With Deterministic Setup Times	Jalani Williams, Weina Wang, Mor Harchol-Balter	2025-12-18	Download	In many systems, servers do not turn on instantly; instead, a setup time must pass before a server can begin work. These "setup times" can wreak havoc on a system's queueing; this is especially true i...
XTC, A Research Platform for Optimizing AI Workload Operators	Pompougnac Hugo, Guillon Christophe, Noiry Sylvain, Dutilleul Alban, Iooss Guillaume, Rastello Fabrice	2025-12-18	Download	Achieving high efficiency on AI operators demands precise control over computation and data movement. However, existing scheduling languages are locked into specific compiler ecosystems, preventing fa...
MultiPath Transfer Engine: Breaking GPU and Host-Memory Bandwidth Bottlenecks in LLM Services	Lingfeng Tang, Daoping Zhang, Junjie Chen, Peihao Huang, Feng Jin, Chengguang Xu, Yuxin Chen, Feiqiang Sun, Guo Chen	2025-12-18	Download	The limited bandwidth of PCIe has emerged as the critical bottleneck for large language model (LLM) performance, such as prefix cache fetching and model switching.
