2025-10-07

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
On-Package Memory with Universal Chiplet Interconnect Express (UCIe): A Low Power, High Bandwidth, Low Latency and Low Cost Approach	Debendra Das Sharma, Swadesh Choudhary, Peter Onufryk, Rob Pelt	2025-10-07	Download	Emerging computing applications such as Artificial Intelligence (AI) are facing a memory wall with existing on-package memory solutions that are unable to meet the power-efficient bandwidth demands.
An opportunity to improve Data Center Efficiency: Optimizing the Server's Upgrade Cycle	Panagiota Nikolaou, Freddy Gabbay, Jawad Haj-Yahya, Yiannakis Sazeides	2025-10-07	Download	This work aims to improve a data center's efficiency by optimizing the server upgrade plan: determine the optimal timing for replacing old servers with new ones.
Simopt-Power: Leveraging Simulation Metadata for Low-Power Design Synthesis	Eashan Wadhwa, Shanker Shreejith	2025-10-07	Download	Excessive switching activity is a primary contributor to dynamic power dissipation in modern FPGAs, where fine-grained configurability amplifies signal toggling and associated capacitance.
From Principles to Practice: A Systematic Study of LLM Serving on Multi-core NPUs	Tianhao Zhu, Dahu Feng, Erhu Feng, Yubin Xia	2025-10-07	Download	With the widespread adoption of Large Language Models (LLMs), the demand for high-performance LLM inference services continues to grow. To meet this demand, a growing number of AI accelerators have be...
Patterns behind Chaos: Forecasting Data Movement for Efficient Large-Scale MoE LLM Inference	Zhongkai Yu, Yue Guan, Zihao Yu, Chenyang Zhou, Zhengding Hu, Shuyi Pei, Yangwook Kang, Yufei Ding, Po-An Tsai	2025-10-07	Download	Large-scale Mixture of Experts (MoE) Large Language Models (LLMs) have recently become the frontier open weight models, achieving remarkable model capability similar to proprietary ones.
cMPI: Using CXL Memory Sharing for MPI One-Sided and Two-Sided Inter-Node Communications	Xi Wang, Bin Ma, Jongryool Kim, Byungil Koh, Hoshik Kim, Dong Li	2025-10-07	Download	Message Passing Interface (MPI) is a foundational programming model for high-performance computing. MPI libraries traditionally employ network interconnects (e.g.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
On-Package Memory with Universal Chiplet Interconnect Express (UCIe): A Low Power, High Bandwidth, Low Latency and Low Cost Approach	Debendra Das Sharma, Swadesh Choudhary, Peter Onufryk, Rob Pelt	2025-10-07	Download	Emerging computing applications such as Artificial Intelligence (AI) are facing a memory wall with existing on-package memory solutions that are unable to meet the power-efficient bandwidth demands.
Context-Aware Inference via Performance Forecasting in Decentralized Learning Networks	Joel Pfeffer, J. M. Diederik Kruijssen, Clément Gossart, Mélanie Chevance, Diego Campo Millan, Florian Stecker, Steven N. Longmore	2025-10-07	Download	In decentralized learning networks, predictions from many participants are combined to generate a network inference. While many studies have demonstrated performance benefits of combining multiple mod...
MuFASA -- Asynchronous Checkpoint for Weakly Consistent Fully Replicated Databases	Raaghav Ravishankar, Sandeep Kulkarni, Nitin H Vaidya	2025-10-07	Download	We focus on the problem of checkpointing in fully replicated weakly consistent distributed databases, which we refer to as Distributed Transaction Consistent Snapshot (DTCS).
Adaptive Protein Design Protocols and Middleware	Aymen Alsaadi, Jonathan Ash, Mikhail Titov, Matteo Turilli, Andre Merzky, Shantenu Jha, Sagar Khare	2025-10-07	Download	Computational protein design is experiencing a transformation driven by AI/ML. However, the range of potential protein sequences and structures is astronomically vast, even for moderately sized protei...
Transforming Lock-free Linked Lists into Distributed Lock-free Linked Lists	Raaghav Ravishankar, Sandeep Kulkarni, Sathya Peri, Gokarna Sharma	2025-10-07	Download	Modern databases use dynamic search structures that store an enormous amount of data, and often serve them using multi-threaded algorithms to support the ever-increasing throughput needs.
Optimal Good-Case Latency for Sleepy Consensus	Yuval Efron, Joachim Neu, Ling Ren, Ertem Nusret Tas	2025-10-07	Download	In the context of Byzantine consensus problems such as Byzantine broadcast (BB) and Byzantine agreement (BA), the good-case setting aims to study the minimal possible latency of a BB or BA protocol un...
How many more is different?	Jacob Calvert, Andréa W. Richa, Dana Randall	2025-10-07	Download	From the formation of ice in small clusters of water molecules to the mass raids of army ant colonies, the emergent behavior of collectives depends critically on their size.
EARL: Efficient Agentic Reinforcement Learning Systems for Large Language Models	Zheyue Tan, Mustapha Abdullahi, Tuo Shi, Huining Yuan, Zelai Xu, Chao Yu, Boxun Li, Bo Zhao	2025-10-07	Download	Reinforcement learning (RL) has become a pivotal component of large language model (LLM) post-training, and agentic RL extends this paradigm to operate as agents through multi-turn interaction and too...
The Anatomy of a Triton Attention Kernel	Burkhard Ringlein, Jan van Lunteren, Radu Stoica, Thomas Parnell	2025-10-07	Download	A long-standing goal in both industry and academia is to develop an LLM inference platform that is portable across hardware architectures, eliminates the need for low-level hand-tuning, and still deli...
A Review of Ontology-Driven Big Data Analytics in Healthcare: Challenges, Tools, and Applications	Ritesh Chandra, Sonali Agarwal, Navjot Singh, Sadhana Tiwari	2025-10-07	Download	Exponential growth in heterogeneous healthcare data arising from electronic health records (EHRs), medical imaging, wearable sensors, and biomedical research has accelerated the adoption of data lakes...
Intertemporal Pricing of Time-Bound Stablecoins: Measuring and Controlling the Liquidity-of-Time Premium	Ailiya Borjigin, Cong He	2025-10-07	Download	Time-bound stablecoins are DeFi assets that temporarily tokenize traditional securities during market off-hours, enabling continuous cross-market liquidity.
Simopt-Power: Leveraging Simulation Metadata for Low-Power Design Synthesis	Eashan Wadhwa, Shanker Shreejith	2025-10-07	Download	Excessive switching activity is a primary contributor to dynamic power dissipation in modern FPGAs, where fine-grained configurability amplifies signal toggling and associated capacitance.
Decoupling Correctness from Policy: A Deterministic Causal Structure for Multi-Agent Systems	Zhiyuan Ren, Tao Zhang, Wenchi Chen	2025-10-07	Download	In distributed multi-agent systems, correctness is often entangled with operational policies such as scheduling, batching, or routing, which makes systems brittle since performance-driven policy evolu...
When Does Global Attention Help? A Unified Empirical Study on Atomistic Graph Learning	Arindam Chowdhury, Massimiliano Lupo Pasini	2025-10-07	Download	Graph neural networks (GNNs) are widely used as surrogates for costly experiments and first-principles simulations to study the behavior of compounds at atomistic scale, and their architectural comple...
Toward Systems Foundations for Agentic Exploration	Jiakai Xu, Tianle Zhou, Eugene Wu, Kostis Kaffes	2025-10-07	Download	Agentic exploration, letting LLM-powered agents branch, backtrack, and search across many execution paths, demands systems support well beyond today's pass-at-k resets.
Patterns behind Chaos: Forecasting Data Movement for Efficient Large-Scale MoE LLM Inference	Zhongkai Yu, Yue Guan, Zihao Yu, Chenyang Zhou, Zhengding Hu, Shuyi Pei, Yangwook Kang, Yufei Ding, Po-An Tsai	2025-10-07	Download	Large-scale Mixture of Experts (MoE) Large Language Models (LLMs) have recently become the frontier open weight models, achieving remarkable model capability similar to proprietary ones.
cMPI: Using CXL Memory Sharing for MPI One-Sided and Two-Sided Inter-Node Communications	Xi Wang, Bin Ma, Jongryool Kim, Byungil Koh, Hoshik Kim, Dong Li	2025-10-07	Download	Message Passing Interface (MPI) is a foundational programming model for high-performance computing. MPI libraries traditionally employ network interconnects (e.g.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Dynamic Scheduling in Fiber and Spaceborne Quantum Repeater Networks	Paolo Fittipaldi	2025-10-07	Download	The problem of scheduling in quantum networks amounts to choosing which entanglement swapping operations to perform to better serve user demand.
Leveraging Generative AI for large-scale prediction-based networking	Mathias Thorsager, Israel Leyva-Mayorga, Petar Popovski	2025-10-07	Download	The traditional role of the network layer is to create an end-to-end route, through which the intermediate nodes replicate and forward the packets towards the destination.
A Deep Q-Network based power control mechanism to Minimize RLF driven Handover Failure in 5G Network	Kotha Kartheek, Shankar K. Ghosh, Megha Iyengar, Vinod Sharma, Souvik Deb	2025-10-07	Download	The impact of Radio link failure (RLF) has been largely ignored in designing handover algorithms, although RLF is a major contributor towards causing handover failure (HF).
On Enhancing Delay SLAs in TCP Networks through Joint Routing and Transport Assistant Deployment	José Gómez-delaHiz, Mohamed Faten Zhani, Jaime Galán-Jiménez, John Kaippallimalil	2025-10-07	Download	The Transport Control Protocol has long been the primary transport protocol for applications requiring performance and reliability over the Internet.
Generative AI-Driven Hierarchical Multi-Agent Framework for Zero-Touch Optical Networks	Yao Zhang, Yuchen Song, Shengnan Li, Yan Shi, Shikui Shen, Xiongyan Tang, Min Zhang, Danshi Wang	2025-10-07	Download	The rapid development of Generative Artificial Intelligence (GenAI) has catalyzed a transformative technological revolution across all walks of life.
XAI-on-RAN: Explainable, AI-native, and GPU-Accelerated RAN Towards 6G	Osman Tugay Basaran, Falko Dressler	2025-10-07	Download	Artificial intelligence (AI)-native radio access networks (RANs) will serve vertical industries with stringent requirements: smart grids, autonomous vehicles, remote healthcare, industrial automation,...
cMPI: Using CXL Memory Sharing for MPI One-Sided and Two-Sided Inter-Node Communications	Xi Wang, Bin Ma, Jongryool Kim, Byungil Koh, Hoshik Kim, Dong Li	2025-10-07	Download	Message Passing Interface (MPI) is a foundational programming model for high-performance computing. MPI libraries traditionally employ network interconnects (e.g.

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
Toward Systems Foundations for Agentic Exploration	Jiakai Xu, Tianle Zhou, Eugene Wu, Kostis Kaffes	2025-10-07	Download	Agentic exploration, letting LLM-powered agents branch, backtrack, and search across many execution paths, demands systems support well beyond today's pass-at-k resets.

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Adaptive Protein Design Protocols and Middleware	Aymen Alsaadi, Jonathan Ash, Mikhail Titov, Matteo Turilli, Andre Merzky, Shantenu Jha, Sagar Khare	2025-10-07	Download	Computational protein design is experiencing a transformation driven by AI/ML. However, the range of potential protein sequences and structures is astronomically vast, even for moderately sized protei...
lm-Meter: Unveiling Runtime Inference Latency for On-Device Language Models	Haoxin Wang, Xiaolong Tu, Hongyu Ke, Huirong Chai, Dawei Chen, Kyungtae Han	2025-10-07	Download	Large Language Models (LLMs) are increasingly integrated into everyday applications, but their prevalent cloud-based deployment raises growing concerns around data privacy and long-term sustainability...
Speeding up SQL subqueries via decoupling of non-correlated predicate (extended version)	Dmitrii Radivonchik, Yakov Kuzin, Anton Chizhov, Dmitriy Shcheka, Mikhail Firsov, Kirill Smirnov, George Chernishev	2025-10-07	Download	In this paper, we discuss a novel technique for processing correlated subqueries in SQL. The core idea is to isolate the non-correlated part of the predicate and use it to reduce the number of evaluat...