2025-10-17

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Cleaning up the Mess	Haocong Luo, Ataberk Olgun, Maria Makeenkova, F. Nisa Bostanci, Geraldo F. Oliveira, A. Giray Yaglikci, Onur Mutlu	2025-10-17	Download	A MICRO 2024 best paper runner-up publication (the Mess paper) with all three artifact badges awarded (including "Reproducible") proposes a new benchmark to evaluate real and simulated memory system p...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Enhancing reliability in AI inference services: An empirical study on real production incidents	Bhala Ranganathan, Mickey Zhang, Kai Wu	2025-10-17	Download	Hyperscale large language model (LLM) inference places extraordinary demands on cloud systems, where even brief failures can translate into significant user and business impact.
Towards a Blockchain-Based CI/CD Framework to Enhance Security in Cloud Environments	Sabbir M Saleh, Nazim Madhavji, John Steinbacher	2025-10-17	Download	Security is becoming a pivotal point in cloud platforms. Several divisions, such as business organisations, health care, government, etc., have experienced cyber-attacks on their infrastructures.
Funky: Cloud-Native FPGA Virtualization and Orchestration	Atsushi Koshiba, Charalampos Mainas, Pramod Bhatotia	2025-10-17	Download	The adoption of FPGAs in cloud-native environments is facing impediments due to FPGA limitations and CPU-oriented design of orchestrators, as they lack virtualization, isolation, and preemption suppor...
GLP: A Grassroots, Multiagent, Concurrent, Logic Programming Language	Ehud Shapiro	2025-10-17	Download	Grassroots platforms are distributed systems with multiple instances that can (1) operate independently of each other and of any global resource other than the network, and (2) coalesce into ever larg...
A Post-Quantum Lower Bound for the Distributed Lovász Local Lemma	Sebastian Brandt, Tim Göttlicher	2025-10-17	Download	In this work, we study the Lovász local lemma (LLL) problem in the area of distributed quantum computing, which has been the focus of attention of recent advances in quantum computing [STOC'24, STOC'2...
GOGH: Correlation-Guided Orchestration of GPUs in Heterogeneous Clusters	Ahmad Raeisi, Mahdi Dolati, Sina Darabi, Sadegh Talebi, Patrick Eugster, Ahmad Khonsari	2025-10-17	Download	The growing demand for computational resources in machine learning has made efficient resource allocation a critical challenge, especially in heterogeneous hardware clusters where devices vary in capa...
PRISM: Probabilistic Runtime Insights and Scalable Performance Modeling for Large-Scale Distributed Training	Alicia Golden, Michael Kuchnik, Samuel Hsia, Zachary DeVito, Gu-Yeon Wei, David Brooks, Carole-Jean Wu	2025-10-17	Download	Large model training beyond tens of thousands of GPUs is an uncharted territory. At such scales, disruptions to the training process are not a matter of if, but a matter of when -- a stochastic proces...
Retrofitting Service Dependency Discovery in Distributed Systems	Diogo Landau, Gijs Blanken, Jorge Barbosa, Nishant Saurabh	2025-10-17	Download	Modern distributed systems rely on complex networks of interconnected services, creating direct or indirect dependencies that can propagate faults and cause cascading failures.
Balancing Fairness and Performance in Multi-User Spark Workloads with Dynamic Scheduling (extended version)	Dāvis Kažemaks, Laurens Versluis, Burcu Kulahcioglu Ozkan, Jérémie Decouchant	2025-10-17	Download	Apache Spark is a widely adopted framework for large-scale data processing. However, in industrial analytics environments, Spark's built-in schedulers, such as FIFO and fair scheduling, struggle to ma...
(Almost) Perfect Discrete Iterative Load Balancing	Petra Berenbrink, Robert Elsässer, Tom Friedetzky, Hamed Hosseinpour, Dominik Kaaser, Peter Kling, Thomas Sauerwald	2025-10-17	Download	We consider discrete, iterative load balancing via matchings on arbitrary graphs. Initially each node holds a certain number of tokens, defining the load of the node, and the objective is to redistrib...
Cloud-Enabled Virtual Prototypes	Tim Kraus, Axel Sauer, Ingo Feldner	2025-10-17	Download	The rapid evolution of embedded systems, along with the growing variety and complexity of AI algorithms, necessitates a powerful hardware/software co-design methodology based on virtual prototyping te...
BeLLMan: Controlling LLM Congestion	Tella Rajashekhar Reddy, Atharva Deshmukh, Karan Tandon, Rohan Gandhi, Anjaly Parayil, Debopam Bhattacherjee	2025-10-17	Download	Large language model (LLM) applications are blindfolded to the infrastructure underneath and generate tokens autoregressively, indifferent to the system load, thus risking inferencing latency inflatio...
Synera: Synergistic LLM Serving across Device and Cloud at Scale	Genglin Wang, Liekang Zeng, Bufang Yang, Kaiwei Liu, Guoliang Xing, Chumin Sun, Li Zhou, Jie Sun, Zhenyu Yan	2025-10-17	Download	Large Language Models (LLMs) are becoming key components in various mobile operating systems, driving smart applications like interactive chatbots and personal assistants.
A Multi-Cloud Framework for Zero-Trust Workload Authentication	Saurabh Deochake, Ryan Murphy, Jeremiah Gearheart	2025-10-17	Download	Static, long-lived credentials for workload authentication create untenable security risks that violate Zero-Trust principles. This paper presents a multi-cloud framework using Workload Identity Feder...
Spatiotemporal Traffic Prediction in Distributed Backend Systems via Graph Neural Networks	Zhimin Qiu, Feng Liu, Yuxiao Wang, Chenrui Hu, Ziyu Cheng, Di Wu	2025-10-17	Download	This paper addresses the problem of traffic prediction in distributed backend systems and proposes a graph neural network based modeling approach to overcome the limitations of traditional models in c...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Agentic AI for Ultra-Modern Networks: Multi-Agent Framework for RAN Autonomy and Assurance	Sukhdeep Singh, Avinash Bhat, Shweta M, Subhash K Singh, Moonki Hong, Madhan Raj K, Kandeepan Sithamparanathan, Sunder A. Khowaja, Kapal Dev	2025-10-17	Download	The increasing complexity of Beyond 5G and 6G networks necessitates new paradigms for autonomy and assurance. Traditional O-RAN control loops rely heavily on RIC-based orchestration, which centrali...
Uno: A One-Stop Solution for Inter- and Intra-Datacenter Congestion Control and Reliable Connectivity	Tommaso Bonato, Sepehr Abdous, Abdul Kabbani, Ahmad Ghalayini, Nadeen Gebara, Terry Lam, Anup Agarwal, Tiancheng Chen, Zhuolong Yu, Konstantin Taranov, Mahmoud Elhaddad, Daniele De Sensi, Soudeh Ghorbani, Torsten Hoefler	2025-10-17	Download	Cloud computing and AI workloads are driving unprecedented demand for efficient communication within and across datacenters. However, the coexistence of intra- and inter-datacenter traffic within data...
Ambusher: Exploring the Security of Distributed SDN Controllers Through Protocol State Fuzzing	Jinwoo Kim, Minjae Seo, Eduard Marin, Seungsoo Lee, Jaehyun Nam, Seungwon Shin	2025-10-17	Download	Distributed SDN (Software-Defined Networking) controllers have rapidly become an integral element of Wide Area Networks (WAN), particularly within SD-WAN, providing scalability and fault-tolerance for...
Flexible Qubit Allocation of Network Resource States	Francesco Mazza, Jorge Miguel-Ramiro, Jessica Illiano, Alexander Pirker, Marcello Caleffi, Angela Sara Cacciapuoti, Wolfgang Dür	2025-10-17	Download	The Quantum Internet is still in its infancy, yet identifying scalable and resilient quantum network resource states is an essential task for realizing it.
Content and Access Networks Synergies: Tradeoffs in Public and Private Investments by Content Providers	Pranay Agarwal, D. Manjunath	2025-10-17	Download	The ubiquity of smartphones has fueled content consumption worldwide, leading to an ever-increasing demand for a better Internet experience. This has necessitated an upgrade of the capacity of the acc...
BeLLMan: Controlling LLM Congestion	Tella Rajashekhar Reddy, Atharva Deshmukh, Karan Tandon, Rohan Gandhi, Anjaly Parayil, Debopam Bhattacherjee	2025-10-17	Download	Large language model (LLM) applications are blindfolded to the infrastructure underneath and generate tokens autoregressively, indifferent to the system load, thus risking inferencing latency inflatio...
A Multi-Cloud Framework for Zero-Trust Workload Authentication	Saurabh Deochake, Ryan Murphy, Jeremiah Gearheart	2025-10-17	Download	Static, long-lived credentials for workload authentication create untenable security risks that violate Zero-Trust principles. This paper presents a multi-cloud framework using Workload Identity Feder...
Structural Generalization for Microservice Routing Using Graph Neural Networks	Chenrui Hu, Ziyu Cheng, Di Wu, Yuxiao Wang, Feng Liu, Zhimin Qiu	2025-10-17	Download	This paper focuses on intelligent routing in microservice systems and proposes an end-to-end optimization framework based on graph neural networks.

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
Funky: Cloud-Native FPGA Virtualization and Orchestration	Atsushi Koshiba, Charalampos Mainas, Pramod Bhatotia	2025-10-17	Download	The adoption of FPGAs in cloud-native environments is facing impediments due to FPGA limitations and CPU-oriented design of orchestrators, as they lack virtualization, isolation, and preemption suppor...
Maratona Linux a tale of upgrading from Ubuntu 20.04 to 22.04	Davi Antônio da Silva Santos, Bruno César Ribas	2025-10-17	Download	Maratona Linux is the development environment used since 2016 on the "Maratona de Programação", ICPC's South American regional contest. It consists of Debian packages that modify a standard Ubuntu i...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Cleaning up the Mess	Haocong Luo, Ataberk Olgun, Maria Makeenkova, F. Nisa Bostanci, Geraldo F. Oliveira, A. Giray Yaglikci, Onur Mutlu	2025-10-17	Download	A MICRO 2024 best paper runner-up publication (the Mess paper) with all three artifact badges awarded (including "Reproducible") proposes a new benchmark to evaluate real and simulated memory system p...
An Experimental Study of Real-Life LLM-Proposed Performance Improvements	Lirong Yi, Gregory Gay, Philipp Leitner	2025-10-17	Download	Large Language Models (LLMs) can generate code, but can they generate fast code? In this paper, we study this question using a dataset of 65 real-world tasks mined from open-source Java programs.
Impact of AI-Triage on Radiologist Report Turnaround Time: Real-World Time-Savings and Insights from Model Predictions	Yee Lam Elim Thompson, Jonathan Fergus, Jonathan Chung, Jana G. Delfino, Weijie Chen, Gary M. Levine, Frank W. Samuelson	2025-10-17	Download	Objective: To quantify the impact of workflow parameters on time-savings in report turnaround time (TAT) due to an AI-triage device that prioritized pulmonary embolism (PE) in chest CT pulmonary angio...