2025-08-02

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| PiKV: KV Cache Management System for Mixture of Experts | Dong Liu, Yanxuan Yu, Ben Lengerich, Ying Nian Wu, Xuhong Wang | 2025-08-02 | Download | As large language models continue to scale up in both size and context length, the memory and communication cost of key-value (KV) cache storage has become a major bottleneck in multi-GPU and multi-no... |
| A Dynamic Allocation Scheme for Adaptive Shared-Memory Mapping on Kilo-core RV Clusters for Attention-Based Model Deployment | Bowen Wang, Marco Bertuletti, Yichao Zhang, Victor J. B. Jung, Luca Benini | 2025-08-02 | Download | Attention-based models demand flexible hardware to manage diverse kernels with varying arithmetic intensities and memory access patterns. Large clusters with shared L1 memory, a common architectural p... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| An Analysis of HPC and Edge Architectures in the Cloud | Steven Santillan, Cristina L. Abad | 2025-08-02 | Download | We analyze a recently published dataset of 396 real-world cloud architectures deployed on AWS, from companies belonging to a wide range of industries. |
| A Parallel Algorithm for Finding Robust Spanners in Large Social Networks | Arindam Khanda, Satyaki Roy, Prithwiraj Roy, Sajal K. Das | 2025-08-02 | Download | Social networks, characterized by community structures, often rely on nodes called structural hole spanners to facilitate inter-community information dissemination. |
| Nakamoto Consensus from Multiple Resources | Mirza Ahad Baig, Christoph U. Günther, Krzysztof Pietrzak | 2025-08-02 | Download | The blocks in the Bitcoin blockchain record the amount of work W that went into creating them through proofs of work. When honest parties control a majority of the work, consensus is achieved by picki... |
| Deterministic Fault-Tolerant Local Load Balancing and its Applications against Adaptive Adversaries | Dariusz R. Kowalski, Jan Olkowski | 2025-08-02 | Download | Load balancing is among the basic primitives in distributed computing. In this paper, we consider this problem when executed locally on a network with nodes prone to failures. |
| Semantic-Aware LLM Orchestration for Proactive Resource Management in Predictive Digital Twin Vehicular Networks | Seyed Hossein Ahmadpanah | 2025-08-02 | Download | Next-generation automotive applications require vehicular edge computing (VEC), but current management systems are essentially fixed and reactive. |
| Agentic TinyML for Intent-aware Handover in 6G Wireless Networks | Alaa Saleh, Roberto Morabito, Sasu Tarkoma, Anders Lindgren, Susanna Pirttikangas, Lauri Lovén | 2025-08-02 | Download | As 6G networks evolve into increasingly AI-driven, user-centric ecosystems, traditional reactive handover mechanisms demonstrate limitations, especially in mobile edge computing and autonomous agent-b... |
| PiKV: KV Cache Management System for Mixture of Experts | Dong Liu, Yanxuan Yu, Ben Lengerich, Ying Nian Wu, Xuhong Wang | 2025-08-02 | Download | As large language models continue to scale up in both size and context length, the memory and communication cost of key-value (KV) cache storage has become a major bottleneck in multi-GPU and multi-no... |
| CarbonScaling: Extending Neural Scaling Laws for Carbon Footprint in Large Language Models | Lei Jiang, Fan Chen | 2025-08-02 | Download | Neural scaling laws have driven the development of increasingly large language models (LLMs) by linking accuracy improvements to growth in parameter count, dataset size, and compute. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| VWAttacker: A Systematic Security Testing Framework for Voice over WiFi User Equipments | Imtiaz Karim, Hyunwoo Lee, Hassan Asghar, Kazi Samin Mubasshir, Seulgi Han, Mashroor Hasan Bhuiyan, Elisa Bertino | 2025-08-02 | Download | We present VWAttacker, the first systematic testing framework for analyzing the security of Voice over WiFi (VoWiFi) User Equipment (UE) implementations. |
| Improving performance of content-centric networks via decentralized coded caching for multi-level popularity and access | Azadeh Sadat Miraftab, Ahmadreza Montazerolghaem, Behrad Mahboobi | 2025-08-02 | Download | Content-Centric Networking (CCN) offers a novel architectural paradigm that seeks to address the inherent limitations of the prevailing Internet Protocol (IP)-based networking model. |
| Semantic-Aware LLM Orchestration for Proactive Resource Management in Predictive Digital Twin Vehicular Networks | Seyed Hossein Ahmadpanah | 2025-08-02 | Download | Next-generation automotive applications require vehicular edge computing (VEC), but current management systems are essentially fixed and reactive. |
| Agentic TinyML for Intent-aware Handover in 6G Wireless Networks | Alaa Saleh, Roberto Morabito, Sasu Tarkoma, Anders Lindgren, Susanna Pirttikangas, Lauri Lovén | 2025-08-02 | Download | As 6G networks evolve into increasingly AI-driven, user-centric ecosystems, traditional reactive handover mechanisms demonstrate limitations, especially in mobile edge computing and autonomous agent-b... |

cs.OS - Operating Systems

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| AgentSight: System-Level Observability for AI Agents Using eBPF | Yusheng Zheng, Yanpeng Hu, Tong Yu, Andi Quinn | 2025-08-02 | Download | Modern software infrastructure increasingly relies on LLM agents for development and maintenance, such as Claude Code and Gemini-cli. However, these AI agents differ fundamentally from traditional det... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| FlashSVD: Memory-Efficient Inference with Streaming for Low-Rank Models | Zishan Shao, Yixiao Wang, Qinsi Wang, Ting Jiang, Zhixu Du, Hancheng Ye, Danyang Zhuo, Yiran Chen, Hai Li | 2025-08-02 | Download | Singular Value Decomposition (SVD) has recently seen a surge of interest as a simple yet powerful tool for large language models (LLMs) compression, with a growing number of works demonstrating 20-80%... |
