2025-12-18

cs.AR - Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Full System Architecture Modeling for Wearable Egocentric Contextual AI | Vincent T. Lee, Tanfer Alan, Sung Kim, Ecenur Ustun, Amr Suleiman, Ajit Krisshna, Tim Balbekov, Armin Alaghi, Richard Newcombe | 2025-12-18 | Download | The next generation of human-oriented computing will require always-on, spatially-aware wearable devices to capture egocentric vision and functional primitives (e.g. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Taming the Memory Footprint Crisis: System Design for Production Diffusion LLM Serving | Jiakun Fan, Yanglin Zhang, Xiangchen Li, Dimitrios S. Nikolopoulos | 2025-12-18 | Download | Diffusion Large Language Models (dLLMs) have emerged as a promising alternative to Autoregressive Models (ARMs), utilizing parallel decoding to overcome sequential bottlenecks. |
| BitFlipScope: Scalable Fault Localization and Recovery for Bit-Flip Corruptions in LLMs | Muhammad Zeeshan Karamat, Sadman Saif, Christiana Chamon Garcia | 2025-12-18 | Download | Large Language Models (LLMs) deployed in practical and safety-critical settings are increasingly susceptible to bit-flip faults caused by hardware degradation, cosmic radiation, or deliberate fault-in... |
| Sedna: Sharding transactions in multiple concurrent proposer blockchains | Alejandro Ranchal-Pedrosa, Benjamin Marsh, Lefteris Kokoris-Kogias, Alberto Sonnino | 2025-12-18 | Download | Modern blockchains increasingly adopt multi-proposer (MCP) consensus to remove single-leader bottlenecks and improve censorship resistance. However, MCP alone does not resolve how users should dissemi... |
| LLM-HPC++: Evaluating LLM-Generated Modern C++ and MPI+OpenMP Codes for Scalable Mandelbrot Set Computation | Patrick Diehl, Noujoud Nader, Deepti Gupta | 2025-12-18 | Download | Parallel programming remains one of the most challenging aspects of High-Performance Computing (HPC), requiring deep knowledge of synchronization, communication, and memory models. |
| Training Together, Diagnosing Better: Federated Learning for Collagen VI-Related Dystrophies | Astrid Brull, Sara Aguti, Véronique Bolduc, Ying Hu, Daniel M. Jimenez-Gutierrez, Enrique Zuazua, Joaquin Del-Rio, Oleksii Sliusarenko, Haiyan Zhou, Francesco Muntoni, Carsten G. Bönnemann, Xabi Uribe-Etxebarria | 2025-12-18 | Download | The application of Machine Learning (ML) to the diagnosis of rare diseases, such as collagen VI-related dystrophies (COL6-RD), is fundamentally limited by the scarcity and fragmentation of available d... |
| Coordinated Anti-Jamming Resilience in Swarm Networks via Multi-Agent Reinforcement Learning | Bahman Abolhassani, Tugba Erpek, Kemal Davaslioglu, Yalin E. Sagduyu, Sastry Kompella | 2025-12-18 | Download | Reactive jammers pose a severe security threat to robotic-swarm networks by selectively disrupting inter-agent communications and undermining formation integrity and mission success. |
| Delay-Aware Multi-Stage Edge Server Upgrade with Budget Constraint | Endar Suprih Wihidayat, Sieteng Soh, Kwan-Wu Chin, Duc-son Pham | 2025-12-18 | Download | In this paper, the Multi-stage Edge Server Upgrade (M-ESU) is proposed as a new network planning problem, involving the upgrading of an existing multi-access edge computing (MEC) system through multip... |
| Efficient Bitcoin Meta-Protocol Transaction and Data Discovery Through nLockTime Field Repurposing | Nikodem Tomczak | 2025-12-18 | Download | We describe the Lockchain Protocol, a lightweight Bitcoin meta-protocol that enables highly efficient transaction discovery at zero marginal block space cost, and data verification without introducing... |
| Efficient CPU-GPU Collaborative Inference for MoE-based LLMs on Memory-Limited Systems | En-Ming Huang, Li-Shang Lin, Chun-Yi Lee | 2025-12-18 | Download | Large Language Models (LLMs) have achieved impressive results across various tasks, yet their high computational demands pose deployment challenges, especially on consumer-grade hardware. |
| AI4EOSC: a Federated Cloud Platform for Artificial Intelligence in Scientific Research | Ignacio Heredia, Álvaro López García, Fernando Aguilar Gómez, Diego Aguirre, Caterina Alarcón Marín, Khadijeh Alibabaei, Lisana Berberi, Miguel Caballer, Amanda Calatrava, Pedro Castro, Alessandro Costantini, Mario David, Jaime Díez, Stefan Dlugolinsky, Borja Esteban Sanchis, Giacinto Donvito, Leonhard Duda, Saúl Fernandez, Andrés Heredia Canales, Valentin Kozlov, Sergio Langarita, João Machado, Germán Moltó, Daniel San Martín, Martin Šeleng, Giang Nguyen, Marcin Płóciennik, Marta Obregón Ruiz, Susana Rebolledo Ruiz, Vicente Rodriguez, Judith Sáinz-Pardo Díaz, Viet Tran | 2025-12-18 | Download | In this paper, we describe a federated compute platform dedicated to support Artificial Intelligence in scientific workloads. Putting the effort into reproducible deployments, it delivers consistent, ... |
| Kascade: A Practical Sparse Attention Method for Long-Context LLM Inference | Dhruv Deshmukh, Saurabh Goyal, Nipun Kwatra, Ramachandran Ramjee | 2025-12-18 | Download | Attention is the dominant source of latency during long-context LLM inference, an increasingly popular workload with reasoning models and RAG. |
| AiiDAlab: on the route to accelerate science | Aliaksandr V. Yakutovich, Jusong Yu, Daniel Hollas, Edan Bainglass, Corsin Battaglia, Miki Bonacci, Lucas Fernandez Vilanova, Stephan Henne, Anders Kaestner, Michel Kenzelmann, Graham Kimbell, Jakob Lass, Fabio Lopes, Daniel G. Mazzone, Andres Ortega-Guerrero, Xing Wang, Nicola Marzari, Carlo A. Pignedoli, Giovanni Pizzi | 2025-12-18 | Download | With the availability of ever-increasing computational capabilities, robust and automated research workflows are essential to enable and facilitate the execution and orchestration of large numbers of ... |
| FlexKV: Flexible Index Offloading for Memory-Disaggregated Key-Value Store | Zhisheng Hu, Jiacheng Shen, Ming-Chang Yang | 2025-12-18 | Download | Disaggregated memory (DM) is a promising data center architecture that decouples CPU and memory into independent resource pools to improve resource utilization. |
| Analysis of Design Patterns and Benchmark Practices in Apache Kafka Event-Streaming Systems | Muzeeb Mohammad | 2025-12-18 | Download | Apache Kafka has become a foundational platform for high throughput event streaming, enabling real time analytics, financial transaction processing, industrial telemetry, and large scale data driven s... |
| Lotus: Optimizing Disaggregated Transactions with Disaggregated Locks | Zhisheng Hu, Pengfei Zuo, Junliang Hu, Yizou Chen, Yingjia Wang, Ming-Chang Yang | 2025-12-18 | Download | Disaggregated memory (DM) separates compute and memory resources, allowing flexible scaling to achieve high resource utilization. To ensure atomic and consistent data access on DM, distributed transac... |
| Staggered Batch Scheduling: Co-optimizing Time-to-First-Token and Throughput for High-Efficiency LLM Inference | Jian Tian, Shuailong Li, Yang Cao, Wenbo Cui, Minghan Zhu, Wenkang Wu, Jianming Zhang, Yanpeng Wang, Zhiwen Xiao, Zhenyu Hou, Dou Shen | 2025-12-18 | Download | The evolution of Large Language Model (LLM) serving towards complex, distributed architectures--specifically the P/D-separated, large-scale DP+EP paradigm--introduces distinct scheduling challenges. |
| An Online Fragmentation-Aware Scheduler for Managing GPU-Sharing Workloads on Multi-Instance GPUs | Hsu-Tzu Ting, Jerry Chou, Ming-Hung Chen, I-Hsin Chung | 2025-12-18 | Download | Modern GPU workloads increasingly demand efficient resource sharing, as many jobs do not require the full capacity of a GPU. Among sharing techniques, NVIDIA's Multi-Instance GPU (MIG) offers strong r... |
| Cold-Start Anti-Patterns and Refactorings in Serverless Systems: An Empirical Study | Syed Salauddin Mohammad Tariq, Foyzul Hassan, Amiangshu Bosu, Probir Roy | 2025-12-18 | Download | Serverless computing simplifies deployment and scaling, yet cold-start latency remains a major performance bottleneck. Unlike prior work that treats mitigation as a black-box optimization, we study co... |
| Twinning for Space-Air-Ground-Sea Integrated Networks: Beyond Conventional Digital Twin Towards Goal-Oriented Semantic Twin | Yifei Qiu, Tianle Liao, Xin Jin, Qinyu Zhang, Shaohua Wu | 2025-12-18 | Download | A space-air-ground-sea integrated network (SAGSIN) has emerged as a cornerstone of 6G systems, establishing a unified global architecture by integrating multi-domain network resources. |
| MultiPath Transfer Engine: Breaking GPU and Host-Memory Bandwidth Bottlenecks in LLM Services | Lingfeng Tang, Daoping Zhang, Junjie Chen, Peihao Huang, Feng Jin, Chengguang Xu, Yuxin Chen, Feiqiang Sun, Guo Chen | 2025-12-18 | Download | The limited bandwidth of PCIe has emerged as the critical bottleneck for large language model (LLM) performance, such as prefix cache fetching and model switching. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| How to Discover Knowledge for FutureG: Contextual RAG and LLM Prompting for O-RAN | Nathan Conger, Nathan Scollar, Kemal Davaslioglu, Yalin E. Sagduyu, Sastry Kompella | 2025-12-18 | Download | We present a retrieval-augmented question answering framework for 5G/6G networks, where the Open Radio Access Network (O-RAN) has become central to disaggregated, virtualized, and AI-driven wireless s... |
| Coordinated Anti-Jamming Resilience in Swarm Networks via Multi-Agent Reinforcement Learning | Bahman Abolhassani, Tugba Erpek, Kemal Davaslioglu, Yalin E. Sagduyu, Sastry Kompella | 2025-12-18 | Download | Reactive jammers pose a severe security threat to robotic-swarm networks by selectively disrupting inter-agent communications and undermining formation integrity and mission success. |
| Acoustic RIS for Massive Spatial Multiplexing: Unleashing Degrees of Freedom and Capacity in Underwater Communications | Longfei Zhao, Jingbo Tan, Jintao Wang, Ian F Akyildiz, Zhi Sun | 2025-12-18 | Download | Underwater acoustic (UWA) communications are essential for high-speed marine data transmission but remain severely constrained by limited bandwidth, significant propagation loss, and sparse multipath ... |
| A Network Arena for Benchmarking AI Agents on Network Troubleshooting | Zhihao Wang, Alessandro Cornacchia, Alessio Sacco, Franco Galante, Marco Canini, Dingde Jiang | 2025-12-18 | Download | Agentic systems, powered by Large Language Models (LLMs), assist network engineers with network configuration synthesis and network troubleshooting tasks. |
| Privacy-Aware Sharing of Raw Spatial Sensor Data for Cooperative Perception | Bangya Liu, Chengpo Yan, Chenghao Jiang, Suman Banerjee, Akarsh Prabhakara | 2025-12-18 | Download | Cooperative perception between vehicles is poised to offer robust and reliable scene understanding. Recently, we are witnessing experimental systems research building testbeds that share raw spatial s... |
| MultiPath Transfer Engine: Breaking GPU and Host-Memory Bandwidth Bottlenecks in LLM Services | Lingfeng Tang, Daoping Zhang, Junjie Chen, Peihao Huang, Feng Jin, Chengguang Xu, Yuxin Chen, Feiqiang Sun, Guo Chen | 2025-12-18 | Download | The limited bandwidth of PCIe has emerged as the critical bottleneck for large language model (LLM) performance, such as prefix cache fetching and model switching. |

cs.OS - Operating Systems

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Trustworthy and Controllable Professional Knowledge Utilization in Large Language Models with TEE-GPU Execution | Yifeng Cai, Zhida An, Yuhan Meng, Houqian Liu, Pengli Wang, Hanwen Lei, Yao Guo, Ding Li | 2025-12-18 | Download | Future improvements in large language model (LLM) services increasingly hinge on access to high-value professional knowledge rather than more generic web data. |

cs.PF - Performance

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| An Upper Bound on the M/M/k Queue With Deterministic Setup Times | Jalani Williams, Weina Wang, Mor Harchol-Balter | 2025-12-18 | Download | In many systems, servers do not turn on instantly; instead, a setup time must pass before a server can begin work. These "setup times" can wreak havoc on a system's queueing; this is especially true i... |
| XTC, A Research Platform for Optimizing AI Workload Operators | Pompougnac Hugo, Guillon Christophe, Noiry Sylvain, Dutilleul Alban, Iooss Guillaume, Rastello Fabrice | 2025-12-18 | Download | Achieving high efficiency on AI operators demands precise control over computation and data movement. However, existing scheduling languages are locked into specific compiler ecosystems, preventing fa... |
| MultiPath Transfer Engine: Breaking GPU and Host-Memory Bandwidth Bottlenecks in LLM Services | Lingfeng Tang, Daoping Zhang, Junjie Chen, Peihao Huang, Feng Jin, Chengguang Xu, Yuxin Chen, Feiqiang Sun, Guo Chen | 2025-12-18 | Download | The limited bandwidth of PCIe has emerged as the critical bottleneck for large language model (LLM) performance, such as prefix cache fetching and model switching. |
