2025-08-02

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
PiKV: KV Cache Management System for Mixture of Experts	Dong Liu, Yanxuan Yu, Ben Lengerich, Ying Nian Wu, Xuhong Wang	2025-08-02	Download	As large language models continue to scale up in both size and context length, the memory and communication cost of key-value (KV) cache storage has become a major bottleneck in multi-GPU and multi-no...
A Dynamic Allocation Scheme for Adaptive Shared-Memory Mapping on Kilo-core RV Clusters for Attention-Based Model Deployment	Bowen Wang, Marco Bertuletti, Yichao Zhang, Victor J. B. Jung, Luca Benini	2025-08-02	Download	Attention-based models demand flexible hardware to manage diverse kernels with varying arithmetic intensities and memory access patterns. Large clusters with shared L1 memory, a common architectural p...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
An Analysis of HPC and Edge Architectures in the Cloud	Steven Santillan, Cristina L. Abad	2025-08-02	Download	We analyze a recently published dataset of 396 real-world cloud architectures deployed on AWS, from companies belonging to a wide range of industries.
A Parallel Algorithm for Finding Robust Spanners in Large Social Networks	Arindam Khanda, Satyaki Roy, Prithwiraj Roy, Sajal K. Das	2025-08-02	Download	Social networks, characterized by community structures, often rely on nodes called structural hole spanners to facilitate inter-community information dissemination.
Nakamoto Consensus from Multiple Resources	Mirza Ahad Baig, Christoph U. Günther, Krzysztof Pietrzak	2025-08-02	Download	The blocks in the Bitcoin blockchain record the amount of work W that went into creating them through proofs of work. When honest parties control a majority of the work, consensus is achieved by picki...
Deterministic Fault-Tolerant Local Load Balancing and its Applications against Adaptive Adversaries	Dariusz R. Kowalski, Jan Olkowski	2025-08-02	Download	Load balancing is among the basic primitives in distributed computing. In this paper, we consider this problem when executed locally on a network with nodes prone to failures.
Semantic-Aware LLM Orchestration for Proactive Resource Management in Predictive Digital Twin Vehicular Networks	Seyed Hossein Ahmadpanah	2025-08-02	Download	Next-generation automotive applications require vehicular edge computing (VEC), but current management systems are essentially fixed and reactive.
Agentic TinyML for Intent-aware Handover in 6G Wireless Networks	Alaa Saleh, Roberto Morabito, Sasu Tarkoma, Anders Lindgren, Susanna Pirttikangas, Lauri Lovén	2025-08-02	Download	As 6G networks evolve into increasingly AI-driven, user-centric ecosystems, traditional reactive handover mechanisms demonstrate limitations, especially in mobile edge computing and autonomous agent-b...
PiKV: KV Cache Management System for Mixture of Experts	Dong Liu, Yanxuan Yu, Ben Lengerich, Ying Nian Wu, Xuhong Wang	2025-08-02	Download	As large language models continue to scale up in both size and context length, the memory and communication cost of key-value (KV) cache storage has become a major bottleneck in multi-GPU and multi-no...
CarbonScaling: Extending Neural Scaling Laws for Carbon Footprint in Large Language Models	Lei Jiang, Fan Chen	2025-08-02	Download	Neural scaling laws have driven the development of increasingly large language models (LLMs) by linking accuracy improvements to growth in parameter count, dataset size, and compute.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
VWAttacker: A Systematic Security Testing Framework for Voice over WiFi User Equipments	Imtiaz Karim, Hyunwoo Lee, Hassan Asghar, Kazi Samin Mubasshir, Seulgi Han, Mashroor Hasan Bhuiyan, Elisa Bertino	2025-08-02	Download	We present VWAttacker, the first systematic testing framework for analyzing the security of Voice over WiFi (VoWiFi) User Equipment (UE) implementations.
Improving performance of content-centric networks via decentralized coded caching for multi-level popularity and access	Azadeh Sadat Miraftab, Ahmadreza Montazerolghaem, Behrad Mahboobi	2025-08-02	Download	Content-Centric Networking (CCN) offers a novel architectural paradigm that seeks to address the inherent limitations of the prevailing Internet Protocol (IP)-based networking model.
Semantic-Aware LLM Orchestration for Proactive Resource Management in Predictive Digital Twin Vehicular Networks	Seyed Hossein Ahmadpanah	2025-08-02	Download	Next-generation automotive applications require vehicular edge computing (VEC), but current management systems are essentially fixed and reactive.
Agentic TinyML for Intent-aware Handover in 6G Wireless Networks	Alaa Saleh, Roberto Morabito, Sasu Tarkoma, Anders Lindgren, Susanna Pirttikangas, Lauri Lovén	2025-08-02	Download	As 6G networks evolve into increasingly AI-driven, user-centric ecosystems, traditional reactive handover mechanisms demonstrate limitations, especially in mobile edge computing and autonomous agent-b...

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
AgentSight: System-Level Observability for AI Agents Using eBPF	Yusheng Zheng, Yanpeng Hu, Tong Yu, Andi Quinn	2025-08-02	Download	Modern software infrastructure increasingly relies on LLM agents for development and maintenance, such as Claude Code and Gemini-cli. However, these AI agents differ fundamentally from traditional det...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
FlashSVD: Memory-Efficient Inference with Streaming for Low-Rank Models	Zishan Shao, Yixiao Wang, Qinsi Wang, Ting Jiang, Zhixu Du, Hancheng Ye, Danyang Zhuo, Yiran Chen, Hai Li	2025-08-02	Download	Singular Value Decomposition (SVD) has recently seen a surge of interest as a simple yet powerful tool for large language models (LLMs) compression, with a growing number of works demonstrating 20-80%...