2025-10-17

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Cleaning up the Mess | Haocong Luo, Ataberk Olgun, Maria Makeenkova, F. Nisa Bostanci, Geraldo F. Oliveira, A. Giray Yaglikci, Onur Mutlu | 2025-10-17 | Download | A MICRO 2024 best paper runner-up publication (the Mess paper) with all three artifact badges awarded (including "Reproducible") proposes a new benchmark to evaluate real and simulated memory system p... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Enhancing reliability in AI inference services: An empirical study on real production incidents | Bhala Ranganathan, Mickey Zhang, Kai Wu | 2025-10-17 | Download | Hyperscale large language model (LLM) inference places extraordinary demands on cloud systems, where even brief failures can translate into significant user and business impact. |
| Towards a Blockchain-Based CI/CD Framework to Enhance Security in Cloud Environments | Sabbir M Saleh, Nazim Madhavji, John Steinbacher | 2025-10-17 | Download | Security is becoming a pivotal point in cloud platforms. Several divisions, such as business organisations, health care, government, etc., have experienced cyber-attacks on their infrastructures. |
| Funky: Cloud-Native FPGA Virtualization and Orchestration | Atsushi Koshiba, Charalampos Mainas, Pramod Bhatotia | 2025-10-17 | Download | The adoption of FPGAs in cloud-native environments is facing impediments due to FPGA limitations and CPU-oriented design of orchestrators, as they lack virtualization, isolation, and preemption suppor... |
| GLP: A Grassroots, Multiagent, Concurrent, Logic Programming Language | Ehud Shapiro | 2025-10-17 | Download | Grassroots platforms are distributed systems with multiple instances that can (1) operate independently of each other and of any global resource other than the network, and (2) coalesce into ever larg... |
| A Post-Quantum Lower Bound for the Distributed Lovász Local Lemma | Sebastian Brandt, Tim Göttlicher | 2025-10-17 | Download | In this work, we study the Lovász local lemma (LLL) problem in the area of distributed quantum computing, which has been the focus of attention of recent advances in quantum computing [STOC'24, STOC'2... |
| GOGH: Correlation-Guided Orchestration of GPUs in Heterogeneous Clusters | Ahmad Raeisi, Mahdi Dolati, Sina Darabi, Sadegh Talebi, Patrick Eugster, Ahmad Khonsari | 2025-10-17 | Download | The growing demand for computational resources in machine learning has made efficient resource allocation a critical challenge, especially in heterogeneous hardware clusters where devices vary in capa... |
| PRISM: Probabilistic Runtime Insights and Scalable Performance Modeling for Large-Scale Distributed Training | Alicia Golden, Michael Kuchnik, Samuel Hsia, Zachary DeVito, Gu-Yeon Wei, David Brooks, Carole-Jean Wu | 2025-10-17 | Download | Large model training beyond tens of thousands of GPUs is an uncharted territory. At such scales, disruptions to the training process are not a matter of if, but a matter of when -- a stochastic proces... |
| Retrofitting Service Dependency Discovery in Distributed Systems | Diogo Landau, Gijs Blanken, Jorge Barbosa, Nishant Saurabh | 2025-10-17 | Download | Modern distributed systems rely on complex networks of interconnected services, creating direct or indirect dependencies that can propagate faults and cause cascading failures. |
| Balancing Fairness and Performance in Multi-User Spark Workloads with Dynamic Scheduling (extended version) | Dāvis Kažemaks, Laurens Versluis, Burcu Kulahcioglu Ozkan, Jérémie Decouchant | 2025-10-17 | Download | Apache Spark is a widely adopted framework for large-scale data processing. However, in industrial analytics environments, Spark's built-in schedulers, such as FIFO and fair scheduling, struggle to ma... |
| (Almost) Perfect Discrete Iterative Load Balancing | Petra Berenbrink, Robert Elsässer, Tom Friedetzky, Hamed Hosseinpour, Dominik Kaaser, Peter Kling, Thomas Sauerwald | 2025-10-17 | Download | We consider discrete, iterative load balancing via matchings on arbitrary graphs. Initially each node holds a certain number of tokens, defining the load of the node, and the objective is to redistrib... |
| Cloud-Enabled Virtual Prototypes | Tim Kraus, Axel Sauer, Ingo Feldner | 2025-10-17 | Download | The rapid evolution of embedded systems, along with the growing variety and complexity of AI algorithms, necessitates a powerful hardware/software co-design methodology based on virtual prototyping te... |
| BeLLMan: Controlling LLM Congestion | Tella Rajashekhar Reddy, Atharva Deshmukh, Karan Tandon, Rohan Gandhi, Anjaly Parayil, Debopam Bhattacherjee | 2025-10-17 | Download | Large language model (LLM) applications are blindfolded to the infrastructure underneath and generate tokens autoregressively, indifferent to the system load, thus risking inferencing latency inflatio... |
| Synera: Synergistic LLM Serving across Device and Cloud at Scale | Genglin Wang, Liekang Zeng, Bufang Yang, Kaiwei Liu, Guoliang Xing, Chumin Sun, Li Zhou, Jie Sun, Zhenyu Yan | 2025-10-17 | Download | Large Language Models (LLMs) are becoming key components in various mobile operating systems, driving smart applications like interactive chatbots and personal assistants. |
| A Multi-Cloud Framework for Zero-Trust Workload Authentication | Saurabh Deochake, Ryan Murphy, Jeremiah Gearheart | 2025-10-17 | Download | Static, long-lived credentials for workload authentication create untenable security risks that violate Zero-Trust principles. This paper presents a multi-cloud framework using Workload Identity Feder... |
| Spatiotemporal Traffic Prediction in Distributed Backend Systems via Graph Neural Networks | Zhimin Qiu, Feng Liu, Yuxiao Wang, Chenrui Hu, Ziyu Cheng, Di Wu | 2025-10-17 | Download | This paper addresses the problem of traffic prediction in distributed backend systems and proposes a graph neural network based modeling approach to overcome the limitations of traditional models in c... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Agentic AI for Ultra-Modern Networks: Multi-Agent Framework for RAN Autonomy and Assurance | Sukhdeep Singh, Avinash Bhat, Shweta M, Subhash K Singh, Moonki Hong, Madhan Raj K, Kandeepan Sithamparanathan, Sunder A. Khowaja, Kapal Dev | 2025-10-17 | Download | The increasing complexity of Beyond 5G and 6G networks necessitates new paradigms for autonomy and assurance. Traditional O-RAN control loops rely heavily on RIC-based orchestration, which centrali... |
| Uno: A One-Stop Solution for Inter- and Intra-Datacenter Congestion Control and Reliable Connectivity | Tommaso Bonato, Sepehr Abdous, Abdul Kabbani, Ahmad Ghalayini, Nadeen Gebara, Terry Lam, Anup Agarwal, Tiancheng Chen, Zhuolong Yu, Konstantin Taranov, Mahmoud Elhaddad, Daniele De Sensi, Soudeh Ghorbani, Torsten Hoefler | 2025-10-17 | Download | Cloud computing and AI workloads are driving unprecedented demand for efficient communication within and across datacenters. However, the coexistence of intra- and inter-datacenter traffic within data... |
| Ambusher: Exploring the Security of Distributed SDN Controllers Through Protocol State Fuzzing | Jinwoo Kim, Minjae Seo, Eduard Marin, Seungsoo Lee, Jaehyun Nam, Seungwon Shin | 2025-10-17 | Download | Distributed SDN (Software-Defined Networking) controllers have rapidly become an integral element of Wide Area Networks (WAN), particularly within SD-WAN, providing scalability and fault-tolerance for... |
| Flexible Qubit Allocation of Network Resource States | Francesco Mazza, Jorge Miguel-Ramiro, Jessica Illiano, Alexander Pirker, Marcello Caleffi, Angela Sara Cacciapuoti, Wolfgang Dür | 2025-10-17 | Download | The Quantum Internet is still in its infancy, yet identifying scalable and resilient quantum network resource states is an essential task for realizing it. |
| Content and Access Networks Synergies: Tradeoffs in Public and Private Investments by Content Providers | Pranay Agarwal, D. Manjunath | 2025-10-17 | Download | The ubiquity of smartphones has fueled content consumption worldwide, leading to an ever-increasing demand for a better Internet experience. This has necessitated an upgrade of the capacity of the acc... |
| BeLLMan: Controlling LLM Congestion | Tella Rajashekhar Reddy, Atharva Deshmukh, Karan Tandon, Rohan Gandhi, Anjaly Parayil, Debopam Bhattacherjee | 2025-10-17 | Download | Large language model (LLM) applications are blindfolded to the infrastructure underneath and generate tokens autoregressively, indifferent to the system load, thus risking inferencing latency inflatio... |
| A Multi-Cloud Framework for Zero-Trust Workload Authentication | Saurabh Deochake, Ryan Murphy, Jeremiah Gearheart | 2025-10-17 | Download | Static, long-lived credentials for workload authentication create untenable security risks that violate Zero-Trust principles. This paper presents a multi-cloud framework using Workload Identity Feder... |
| Structural Generalization for Microservice Routing Using Graph Neural Networks | Chenrui Hu, Ziyu Cheng, Di Wu, Yuxiao Wang, Feng Liu, Zhimin Qiu | 2025-10-17 | Download | This paper focuses on intelligent routing in microservice systems and proposes an end-to-end optimization framework based on graph neural networks. |

cs.OS - Operating Systems

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Funky: Cloud-Native FPGA Virtualization and Orchestration | Atsushi Koshiba, Charalampos Mainas, Pramod Bhatotia | 2025-10-17 | Download | The adoption of FPGAs in cloud-native environments is facing impediments due to FPGA limitations and CPU-oriented design of orchestrators, as they lack virtualization, isolation, and preemption suppor... |
| Maratona Linux a tale of upgrading from Ubuntu 20.04 to 22.04 | Davi Antônio da Silva Santos, Bruno César Ribas | 2025-10-17 | Download | Maratona Linux is the development environment used since 2016 on the "Maratona de Programação", ICPC's South American regional contest. It consists of Debian packages that modify a standard Ubuntu i... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Cleaning up the Mess | Haocong Luo, Ataberk Olgun, Maria Makeenkova, F. Nisa Bostanci, Geraldo F. Oliveira, A. Giray Yaglikci, Onur Mutlu | 2025-10-17 | Download | A MICRO 2024 best paper runner-up publication (the Mess paper) with all three artifact badges awarded (including "Reproducible") proposes a new benchmark to evaluate real and simulated memory system p... |
| An Experimental Study of Real-Life LLM-Proposed Performance Improvements | Lirong Yi, Gregory Gay, Philipp Leitner | 2025-10-17 | Download | Large Language Models (LLMs) can generate code, but can they generate fast code? In this paper, we study this question using a dataset of 65 real-world tasks mined from open-source Java programs. |
| Impact of AI-Triage on Radiologist Report Turnaround Time: Real-World Time-Savings and Insights from Model Predictions | Yee Lam Elim Thompson, Jonathan Fergus, Jonathan Chung, Jana G. Delfino, Weijie Chen, Gary M. Levine, Frank W. Samuelson | 2025-10-17 | Download | Objective: To quantify the impact of workflow parameters on time-savings in report turnaround time (TAT) due to an AI-triage device that prioritized pulmonary embolism (PE) in chest CT pulmonary angio... |
