2025-10-13

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Rescaling-Aware Training for Efficient Deployment of Deep Learning Models on Full-Integer Hardware	Lion Mueller, Alberto Garcia-Ortiz, Ardalan Najafi, Adam Fuks, Lennart Bamberg	2025-10-13	Download	Integer AI inference significantly reduces computational complexity in embedded systems. Quantization-aware training (QAT) helps mitigate accuracy degradation associated with post-training quantizatio...
Efficient In-Memory Acceleration of Sparse Block Diagonal LLMs	João Paulo Cardoso de Lima, Marc Dietrich, Jeronimo Castrillon, Asif Ali Khan	2025-10-13	Download	Structured sparsity enables deploying large language models (LLMs) on resource-constrained systems. Approaches like dense-to-sparse fine-tuning are particularly compelling, achieving remarkable struct...
Improving AI Efficiency in Data Centres by Power Dynamic Response	Andrea Marinoni, Sai Shivareddy, Pietro Lio', Weisi Lin, Erik Cambria, Clare Grey	2025-10-13	Download	The steady growth of artificial intelligence (AI) has accelerated in the recent years, facilitated by the development of sophisticated models such as large language models and foundation models.
FeNOMS: Enhancing Open Modification Spectral Library Search with In-Storage Processing on Ferroelectric NAND (FeNAND) Flash	Sumukh Pinge, Ashkan Moradifirouzabadi, Keming Fan, Prasanna Venkatesan Ravindran, Tanvir H. Pantha, Po-Kai Hsu, Zheyu Li, Weihong Xu, Zihan Xia, Flavio Ponzina, Winston Chern, Taeyoung Song, Priyankka Ravikumar, Mengkun Tian, Lance Fernandes, Huy Tran, Hari Jayasankar, Hang Chen, Chinsung Park, Amrit Garlapati, Kijoon Kim, Jongho Woo, Suhwan Lim, Kwangsoo Kim, Wanki Kim, Daewon Ha, Duygu Kuzum, Shimeng Yu, Sourav Dutta, Asif Khan, Tajana Rosing, Mingu Kang	2025-10-13	Download	The rapid expansion of mass spectrometry (MS) data, now exceeding hundreds of terabytes, poses significant challenges for efficient, large-scale library search - a critical component for drug discover...
A Joint Learning Approach to Hardware Caching and Prefetching	Samuel Yuan, Divyanshu Saxena, Jiayi Chen, Nihal Sharma, Aditya Akella	2025-10-13	Download	Several learned policies have been proposed to replace heuristics for scheduling, caching, and other system components in modern systems. By leveraging diverse features, learning from historical trend...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
FlexPipe: Adapting Dynamic LLM Serving Through Inflight Pipeline Refactoring in Fragmented Serverless Clusters	Yanying Lin, Shijie Peng, Chengzhi Lu, Chengzhong Xu, Kejiang Ye	2025-10-13	Download	Serving Large Language Models (LLMs) in production faces significant challenges from highly variable request patterns and severe resource fragmentation in serverless clusters.
Rationally Analyzing Shelby: Proving Incentive Compatibility in a Decentralized Storage Network	Michael Crystal, Guy Goren, Scott Duke Kominers	2025-10-13	Download	Decentralized storage is one of the most natural applications built on blockchains and a central component of the Web3 ecosystem. Yet despite a decade of active development -- from IPFS and Filecoin t...
A Fast-Converging Decentralized Approach to the Weighted Minimum Vertex Cover Problem	Matteo Mordacchini, Emanuele Carlini, Patrizio Dazzi	2025-10-13	Download	We address the problem of computing a Minimum Weighted Vertex Cover (MWVC) in a decentralized network. MWVC, a classical NP-hard problem, is foundational in applications such as network monitoring and...
An Asynchronous Many-Task Algorithm for Unstructured $S_{N}$ Transport on Shared Memory Systems	Alex Elwood, Tom Deakin, Justin Lovegrove, Chris Nelson	2025-10-13	Download	Discrete ordinates $S_N$ transport solvers on unstructured meshes pose a challenge to scale due to complex data dependencies, memory access patterns and a high-dimensional domain.
An Explorative Study on Distributed Computing Techniques in Training and Inference of Large Language Models	Sheikh Azizul Hakim, Saem Hasan	2025-10-13	Download	Large language models (LLM) are advanced AI systems trained on extensive textual data, leveraging deep learning techniques to understand and generate human-like language.
A Decentralized Microservice Scheduling Approach Using Service Mesh in Cloud-Edge Systems	Yangyang Wen, Paul Townend, Per-Olov Östberg, Abel Souza, Clément Courageux-Sudan	2025-10-13	Download	As microservice-based systems scale across the cloud-edge continuum, traditional centralized scheduling mechanisms increasingly struggle with latency, coordination overhead, and fault tolerance.
Improving AI Efficiency in Data Centres by Power Dynamic Response	Andrea Marinoni, Sai Shivareddy, Pietro Lio', Weisi Lin, Erik Cambria, Clare Grey	2025-10-13	Download	The steady growth of artificial intelligence (AI) has accelerated in the recent years, facilitated by the development of sophisticated models such as large language models and foundation models.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Stable and Fault-Tolerant Decentralized Traffic Engineering	Arjun Devraj, Umesh Krishnaswamy, Ying Zhang, Karuna Grewal, Justin Hsu, Eva Tardos, Rachee Singh	2025-10-13	Download	Cloud providers have recently decentralized their wide-area network traffic engineering (TE) systems to contain the impact of TE controller failures.
A Flexible Multi-Agent Deep Reinforcement Learning Framework for Dynamic Routing and Scheduling of Latency-Critical Services	Vincenzo Norman Vitale, Antonia Maria Tulino, Andreas F. Molisch, Jaime Llorca	2025-10-13	Download	Timely delivery of delay-sensitive information over dynamic, heterogeneous networks is increasingly essential for a range of interactive applications, such as industrial automation, self-driving vehic...
CIRSense: Rethinking WiFi Sensing with Channel Impulse Response	Ruiqi Kong, He Chen	2025-10-13	Download	WiFi sensing based on channel state information (CSI) collected from commodity WiFi devices has shown great potential across a wide range of applications, including vital sign monitoring and indoor lo...
A protocol to reduce worst-case latency in deflection-based on-chip networks	Leandro Soares Indrusiak	2025-10-13	Download	We present a novel protocol that reduces worst-case packet latency in deflection-based on-chip interconnect networks. It enforces the deflection of the header of a packet but not its payload, resultin...
Network-Optimised Spiking Neural Network (NOS) Scheduling for 6G O-RAN: Spectral Margin and Delay-Tail Control	Muhammad Bilal, Xiaolong Xu	2025-10-13	Download	This work presents a Network-Optimised Spiking (NOS) delay-aware scheduler for 6G radio access. The scheme couples a bounded two-state kernel to a clique-feasible proportional-fair (PF) grant head: th...
From Prompts to Packets: A View from the Network on ChatGPT, Copilot, and Gemini	Antonio Montieri, Alfredo Nascita, Antonio Pescapè	2025-10-13	Download	GenAI chatbots are now pervasive in digital ecosystems, fundamentally reshaping user interactions over the Internet. Their reliance on an always-online, cloud-centric operating model introduces novel ...
Age of Information-Aware Cognitive Shared Access Networks with Energy Harvesting	Georgios Smpokos, Dionysis Xenakis, Marios Kountouris, Nikolaos Pappas	2025-10-13	Download	This study investigates a cognitive shared access network with energy harvesting capabilities operating under Age of Information (AoI) constraints for the primary user.
Visible Light Communication for Vehicular Networks: A Tutorial	Pedro E. Gória Silva, Eduardo S. Lima, Jules M. Moualeu, Mohamed Korium, Pedro H. J. Nardelli	2025-10-13	Download	The advent of the fifth-generation technology promises to bring about more vertical applications and emerging services that include vehicular networks and intelligent transportation systems (ITSs).
Graph Neural Network-Based Multicast Routing for On-Demand Streaming Services in 6G Networks	Xiucheng Wang, Zien Wang, Nan Cheng, Wenchao Xu, Wei Quan, Xuemin Shen	2025-10-13	Download	The increase of bandwidth-intensive applications in sixth-generation (6G) wireless networks, such as real-time volumetric streaming and multi-sensory extended reality, demands intelligent multicast ro...
Zephyrus: Scaling Gateways Beyond the Petabit-Era with DPU-Augmented Hierarchical Co-Offloading	Yuemeng Xu, Haoran Chen, Jiarui Guo, Mingwei Cui, Qiuheng Yin, Cheng Dong, Daxiang Kang, Xian Wu, Chenmin Sun, Peng He, Yang Gao, Lirong Lai, Kai Wang, Hongyu Wu, Tong Yang, Xiyun Xu	2025-10-13	Download	Operating at petabit-scale, ByteDance's cloud gateways are deployed at critical aggregation points to orchestrate a wide array of business traffic.

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
Detection of Performance Changes in MooBench Results Using Nyrkiö on GitHub Actions	Shinhyung Yang, David Georg Reichelt, Henrik Ingo, Wilhelm Hasselbring	2025-10-13	Download	In GitHub with its 518 million hosted projects, performance changes within these projects are highly relevant to the project's users. Although performance measurement is supported by GitHub CI/CD, per...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
A protocol to reduce worst-case latency in deflection-based on-chip networks	Leandro Soares Indrusiak	2025-10-13	Download	We present a novel protocol that reduces worst-case packet latency in deflection-based on-chip interconnect networks. It enforces the deflection of the header of a packet but not its payload, resultin...
Detection of Performance Changes in MooBench Results Using Nyrkiö on GitHub Actions	Shinhyung Yang, David Georg Reichelt, Henrik Ingo, Wilhelm Hasselbring	2025-10-13	Download	In GitHub with its 518 million hosted projects, performance changes within these projects are highly relevant to the project's users. Although performance measurement is supported by GitHub CI/CD, per...
A new 1/(1-ρ)-scaling bound for multiserver queues via a leave-one-out technique	Yige Hong	2025-10-13	Download	Bounding the queue length in a multiserver queue is a central challenge in queueing theory. Even for the classical $G/G/n$ queue with homogeneous servers, it is highly non-trivial to derive a simple a...