
2025-10-13

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Rescaling-Aware Training for Efficient Deployment of Deep Learning Models on Full-Integer Hardware | Lion Mueller, Alberto Garcia-Ortiz, Ardalan Najafi, Adam Fuks, Lennart Bamberg | 2025-10-13 | Download | Integer AI inference significantly reduces computational complexity in embedded systems. Quantization-aware training (QAT) helps mitigate accuracy degradation associated with post-training quantizatio... |
| Efficient In-Memory Acceleration of Sparse Block Diagonal LLMs | João Paulo Cardoso de Lima, Marc Dietrich, Jeronimo Castrillon, Asif Ali Khan | 2025-10-13 | Download | Structured sparsity enables deploying large language models (LLMs) on resource-constrained systems. Approaches like dense-to-sparse fine-tuning are particularly compelling, achieving remarkable struct... |
| Improving AI Efficiency in Data Centres by Power Dynamic Response | Andrea Marinoni, Sai Shivareddy, Pietro Lio', Weisi Lin, Erik Cambria, Clare Grey | 2025-10-13 | Download | The steady growth of artificial intelligence (AI) has accelerated in the recent years, facilitated by the development of sophisticated models such as large language models and foundation models. |
| FeNOMS: Enhancing Open Modification Spectral Library Search with In-Storage Processing on Ferroelectric NAND (FeNAND) Flash | Sumukh Pinge, Ashkan Moradifirouzabadi, Keming Fan, Prasanna Venkatesan Ravindran, Tanvir H. Pantha, Po-Kai Hsu, Zheyu Li, Weihong Xu, Zihan Xia, Flavio Ponzina, Winston Chern, Taeyoung Song, Priyankka Ravikumar, Mengkun Tian, Lance Fernandes, Huy Tran, Hari Jayasankar, Hang Chen, Chinsung Park, Amrit Garlapati, Kijoon Kim, Jongho Woo, Suhwan Lim, Kwangsoo Kim, Wanki Kim, Daewon Ha, Duygu Kuzum, Shimeng Yu, Sourav Dutta, Asif Khan, Tajana Rosing, Mingu Kang | 2025-10-13 | Download | The rapid expansion of mass spectrometry (MS) data, now exceeding hundreds of terabytes, poses significant challenges for efficient, large-scale library search - a critical component for drug discover... |
| A Joint Learning Approach to Hardware Caching and Prefetching | Samuel Yuan, Divyanshu Saxena, Jiayi Chen, Nihal Sharma, Aditya Akella | 2025-10-13 | Download | Several learned policies have been proposed to replace heuristics for scheduling, caching, and other system components in modern systems. By leveraging diverse features, learning from historical trend... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| FlexPipe: Adapting Dynamic LLM Serving Through Inflight Pipeline Refactoring in Fragmented Serverless Clusters | Yanying Lin, Shijie Peng, Chengzhi Lu, Chengzhong Xu, Kejiang Ye | 2025-10-13 | Download | Serving Large Language Models (LLMs) in production faces significant challenges from highly variable request patterns and severe resource fragmentation in serverless clusters. |
| Rationally Analyzing Shelby: Proving Incentive Compatibility in a Decentralized Storage Network | Michael Crystal, Guy Goren, Scott Duke Kominers | 2025-10-13 | Download | Decentralized storage is one of the most natural applications built on blockchains and a central component of the Web3 ecosystem. Yet despite a decade of active development -- from IPFS and Filecoin t... |
| A Fast-Converging Decentralized Approach to the Weighted Minimum Vertex Cover Problem | Matteo Mordacchini, Emanuele Carlini, Patrizio Dazzi | 2025-10-13 | Download | We address the problem of computing a Minimum Weighted Vertex Cover (MWVC) in a decentralized network. MWVC, a classical NP-hard problem, is foundational in applications such as network monitoring and... |
| An Asynchronous Many-Task Algorithm for Unstructured $S_N$ Transport on Shared Memory Systems | Alex Elwood, Tom Deakin, Justin Lovegrove, Chris Nelson | 2025-10-13 | Download | Discrete ordinates $S_N$ transport solvers on unstructured meshes pose a challenge to scale due to complex data dependencies, memory access patterns and a high-dimensional domain. |
| An Explorative Study on Distributed Computing Techniques in Training and Inference of Large Language Models | Sheikh Azizul Hakim, Saem Hasan | 2025-10-13 | Download | Large language models (LLM) are advanced AI systems trained on extensive textual data, leveraging deep learning techniques to understand and generate human-like language. |
| A Decentralized Microservice Scheduling Approach Using Service Mesh in Cloud-Edge Systems | Yangyang Wen, Paul Townend, Per-Olov Östberg, Abel Souza, Clément Courageux-Sudan | 2025-10-13 | Download | As microservice-based systems scale across the cloud-edge continuum, traditional centralized scheduling mechanisms increasingly struggle with latency, coordination overhead, and fault tolerance. |
| Improving AI Efficiency in Data Centres by Power Dynamic Response | Andrea Marinoni, Sai Shivareddy, Pietro Lio', Weisi Lin, Erik Cambria, Clare Grey | 2025-10-13 | Download | The steady growth of artificial intelligence (AI) has accelerated in the recent years, facilitated by the development of sophisticated models such as large language models and foundation models. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Stable and Fault-Tolerant Decentralized Traffic Engineering | Arjun Devraj, Umesh Krishnaswamy, Ying Zhang, Karuna Grewal, Justin Hsu, Eva Tardos, Rachee Singh | 2025-10-13 | Download | Cloud providers have recently decentralized their wide-area network traffic engineering (TE) systems to contain the impact of TE controller failures. |
| A Flexible Multi-Agent Deep Reinforcement Learning Framework for Dynamic Routing and Scheduling of Latency-Critical Services | Vincenzo Norman Vitale, Antonia Maria Tulino, Andreas F. Molisch, Jaime Llorca | 2025-10-13 | Download | Timely delivery of delay-sensitive information over dynamic, heterogeneous networks is increasingly essential for a range of interactive applications, such as industrial automation, self-driving vehic... |
| CIRSense: Rethinking WiFi Sensing with Channel Impulse Response | Ruiqi Kong, He Chen | 2025-10-13 | Download | WiFi sensing based on channel state information (CSI) collected from commodity WiFi devices has shown great potential across a wide range of applications, including vital sign monitoring and indoor lo... |
| A protocol to reduce worst-case latency in deflection-based on-chip networks | Leandro Soares Indrusiak | 2025-10-13 | Download | We present a novel protocol that reduces worst-case packet latency in deflection-based on-chip interconnect networks. It enforces the deflection of the header of a packet but not its payload, resultin... |
| Network-Optimised Spiking Neural Network (NOS) Scheduling for 6G O-RAN: Spectral Margin and Delay-Tail Control | Muhammad Bilal, Xiaolong Xu | 2025-10-13 | Download | This work presents a Network-Optimised Spiking (NOS) delay-aware scheduler for 6G radio access. The scheme couples a bounded two-state kernel to a clique-feasible proportional-fair (PF) grant head: th... |
| From Prompts to Packets: A View from the Network on ChatGPT, Copilot, and Gemini | Antonio Montieri, Alfredo Nascita, Antonio Pescapè | 2025-10-13 | Download | GenAI chatbots are now pervasive in digital ecosystems, fundamentally reshaping user interactions over the Internet. Their reliance on an always-online, cloud-centric operating model introduces novel ... |
| Age of Information-Aware Cognitive Shared Access Networks with Energy Harvesting | Georgios Smpokos, Dionysis Xenakis, Marios Kountouris, Nikolaos Pappas | 2025-10-13 | Download | This study investigates a cognitive shared access network with energy harvesting capabilities operating under Age of Information (AoI) constraints for the primary user. |
| Visible Light Communication for Vehicular Networks: A Tutorial | Pedro E. Gória Silva, Eduardo S. Lima, Jules M. Moualeu, Mohamed Korium, Pedro H. J. Nardelli | 2025-10-13 | Download | The advent of the fifth-generation technology promises to bring about more vertical applications and emerging services that include vehicular networks and intelligent transportation systems (ITSs). |
| Graph Neural Network-Based Multicast Routing for On-Demand Streaming Services in 6G Networks | Xiucheng Wang, Zien Wang, Nan Cheng, Wenchao Xu, Wei Quan, Xuemin Shen | 2025-10-13 | Download | The increase of bandwidth-intensive applications in sixth-generation (6G) wireless networks, such as real-time volumetric streaming and multi-sensory extended reality, demands intelligent multicast ro... |
| Zephyrus: Scaling Gateways Beyond the Petabit-Era with DPU-Augmented Hierarchical Co-Offloading | Yuemeng Xu, Haoran Chen, Jiarui Guo, Mingwei Cui, Qiuheng Yin, Cheng Dong, Daxiang Kang, Xian Wu, Chenmin Sun, Peng He, Yang Gao, Lirong Lai, Kai Wang, Hongyu Wu, Tong Yang, Xiyun Xu | 2025-10-13 | Download | Operating at petabit-scale, ByteDance's cloud gateways are deployed at critical aggregation points to orchestrate a wide array of business traffic. |

cs.OS - Operating Systems

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Detection of Performance Changes in MooBench Results Using Nyrkiö on GitHub Actions | Shinhyung Yang, David Georg Reichelt, Henrik Ingo, Wilhelm Hasselbring | 2025-10-13 | Download | In GitHub with its 518 million hosted projects, performance changes within these projects are highly relevant to the project's users. Although performance measurement is supported by GitHub CI/CD, per... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| A protocol to reduce worst-case latency in deflection-based on-chip networks | Leandro Soares Indrusiak | 2025-10-13 | Download | We present a novel protocol that reduces worst-case packet latency in deflection-based on-chip interconnect networks. It enforces the deflection of the header of a packet but not its payload, resultin... |
| Detection of Performance Changes in MooBench Results Using Nyrkiö on GitHub Actions | Shinhyung Yang, David Georg Reichelt, Henrik Ingo, Wilhelm Hasselbring | 2025-10-13 | Download | In GitHub with its 518 million hosted projects, performance changes within these projects are highly relevant to the project's users. Although performance measurement is supported by GitHub CI/CD, per... |
| A new 1/(1-ρ)-scaling bound for multiserver queues via a leave-one-out technique | Yige Hong | 2025-10-13 | Download | Bounding the queue length in a multiserver queue is a central challenge in queueing theory. Even for the classical $G/G/n$ queue with homogeneous servers, it is highly non-trivial to derive a simple a... |
