2025-10-07

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| On-Package Memory with Universal Chiplet Interconnect Express (UCIe): A Low Power, High Bandwidth, Low Latency and Low Cost Approach | Debendra Das Sharma, Swadesh Choudhary, Peter Onufryk, Rob Pelt | 2025-10-07 | Download | Emerging computing applications such as Artificial Intelligence (AI) are facing a memory wall with existing on-package memory solutions that are unable to meet the power-efficient bandwidth demands. |
| An opportunity to improve Data Center Efficiency: Optimizing the Server's Upgrade Cycle | Panagiota Nikolaou, Freddy Gabbay, Jawad Haj-Yahya, Yiannakis Sazeides | 2025-10-07 | Download | This work aims to improve a data center's efficiency by optimizing the server upgrade plan: determine the optimal timing for replacing old servers with new ones. |
| Simopt-Power: Leveraging Simulation Metadata for Low-Power Design Synthesis | Eashan Wadhwa, Shanker Shreejith | 2025-10-07 | Download | Excessive switching activity is a primary contributor to dynamic power dissipation in modern FPGAs, where fine-grained configurability amplifies signal toggling and associated capacitance. |
| From Principles to Practice: A Systematic Study of LLM Serving on Multi-core NPUs | Tianhao Zhu, Dahu Feng, Erhu Feng, Yubin Xia | 2025-10-07 | Download | With the widespread adoption of Large Language Models (LLMs), the demand for high-performance LLM inference services continues to grow. To meet this demand, a growing number of AI accelerators have be... |
| Patterns behind Chaos: Forecasting Data Movement for Efficient Large-Scale MoE LLM Inference | Zhongkai Yu, Yue Guan, Zihao Yu, Chenyang Zhou, Zhengding Hu, Shuyi Pei, Yangwook Kang, Yufei Ding, Po-An Tsai | 2025-10-07 | Download | Large-scale Mixture of Experts (MoE) Large Language Models (LLMs) have recently become the frontier open weight models, achieving remarkable model capability similar to proprietary ones. |
| cMPI: Using CXL Memory Sharing for MPI One-Sided and Two-Sided Inter-Node Communications | Xi Wang, Bin Ma, Jongryool Kim, Byungil Koh, Hoshik Kim, Dong Li | 2025-10-07 | Download | Message Passing Interface (MPI) is a foundational programming model for high-performance computing. MPI libraries traditionally employ network interconnects (e.g. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| On-Package Memory with Universal Chiplet Interconnect Express (UCIe): A Low Power, High Bandwidth, Low Latency and Low Cost Approach | Debendra Das Sharma, Swadesh Choudhary, Peter Onufryk, Rob Pelt | 2025-10-07 | Download | Emerging computing applications such as Artificial Intelligence (AI) are facing a memory wall with existing on-package memory solutions that are unable to meet the power-efficient bandwidth demands. |
| Context-Aware Inference via Performance Forecasting in Decentralized Learning Networks | Joel Pfeffer, J. M. Diederik Kruijssen, Clément Gossart, Mélanie Chevance, Diego Campo Millan, Florian Stecker, Steven N. Longmore | 2025-10-07 | Download | In decentralized learning networks, predictions from many participants are combined to generate a network inference. While many studies have demonstrated performance benefits of combining multiple mod... |
| MuFASA -- Asynchronous Checkpoint for Weakly Consistent Fully Replicated Databases | Raaghav Ravishankar, Sandeep Kulkarni, Nitin H Vaidya | 2025-10-07 | Download | We focus on the problem of checkpointing in fully replicated weakly consistent distributed databases, which we refer to as Distributed Transaction Consistent Snapshot (DTCS). |
| Adaptive Protein Design Protocols and Middleware | Aymen Alsaadi, Jonathan Ash, Mikhail Titov, Matteo Turilli, Andre Merzky, Shantenu Jha, Sagar Khare | 2025-10-07 | Download | Computational protein design is experiencing a transformation driven by AI/ML. However, the range of potential protein sequences and structures is astronomically vast, even for moderately sized protei... |
| Transforming Lock-free Linked Lists into Distributed Lock-free Linked Lists | Raaghav Ravishankar, Sandeep Kulkarni, Sathya Peri, Gokarna Sharma | 2025-10-07 | Download | Modern databases use dynamic search structures that store an enormous amount of data, and often serve them using multi-threaded algorithms to support the ever-increasing throughput needs. |
| Optimal Good-Case Latency for Sleepy Consensus | Yuval Efron, Joachim Neu, Ling Ren, Ertem Nusret Tas | 2025-10-07 | Download | In the context of Byzantine consensus problems such as Byzantine broadcast (BB) and Byzantine agreement (BA), the good-case setting aims to study the minimal possible latency of a BB or BA protocol un... |
| How many more is different? | Jacob Calvert, Andréa W. Richa, Dana Randall | 2025-10-07 | Download | From the formation of ice in small clusters of water molecules to the mass raids of army ant colonies, the emergent behavior of collectives depends critically on their size. |
| EARL: Efficient Agentic Reinforcement Learning Systems for Large Language Models | Zheyue Tan, Mustapha Abdullahi, Tuo Shi, Huining Yuan, Zelai Xu, Chao Yu, Boxun Li, Bo Zhao | 2025-10-07 | Download | Reinforcement learning (RL) has become a pivotal component of large language model (LLM) post-training, and agentic RL extends this paradigm to operate as agents through multi-turn interaction and too... |
| The Anatomy of a Triton Attention Kernel | Burkhard Ringlein, Jan van Lunteren, Radu Stoica, Thomas Parnell | 2025-10-07 | Download | A long-standing goal in both industry and academia is to develop an LLM inference platform that is portable across hardware architectures, eliminates the need for low-level hand-tuning, and still deli... |
| A Review of Ontology-Driven Big Data Analytics in Healthcare: Challenges, Tools, and Applications | Ritesh Chandra, Sonali Agarwal, Navjot Singh, Sadhana Tiwari | 2025-10-07 | Download | Exponential growth in heterogeneous healthcare data arising from electronic health records (EHRs), medical imaging, wearable sensors, and biomedical research has accelerated the adoption of data lakes... |
| Intertemporal Pricing of Time-Bound Stablecoins: Measuring and Controlling the Liquidity-of-Time Premium | Ailiya Borjigin, Cong He | 2025-10-07 | Download | Time-bound stablecoins are DeFi assets that temporarily tokenize traditional securities during market off-hours, enabling continuous cross-market liquidity. |
| Simopt-Power: Leveraging Simulation Metadata for Low-Power Design Synthesis | Eashan Wadhwa, Shanker Shreejith | 2025-10-07 | Download | Excessive switching activity is a primary contributor to dynamic power dissipation in modern FPGAs, where fine-grained configurability amplifies signal toggling and associated capacitance. |
| Decoupling Correctness from Policy: A Deterministic Causal Structure for Multi-Agent Systems | Zhiyuan Ren, Tao Zhang, Wenchi Chen | 2025-10-07 | Download | In distributed multi-agent systems, correctness is often entangled with operational policies such as scheduling, batching, or routing, which makes systems brittle since performance-driven policy evolu... |
| When Does Global Attention Help? A Unified Empirical Study on Atomistic Graph Learning | Arindam Chowdhury, Massimiliano Lupo Pasini | 2025-10-07 | Download | Graph neural networks (GNNs) are widely used as surrogates for costly experiments and first-principles simulations to study the behavior of compounds at atomistic scale, and their architectural comple... |
| Toward Systems Foundations for Agentic Exploration | Jiakai Xu, Tianle Zhou, Eugene Wu, Kostis Kaffes | 2025-10-07 | Download | Agentic exploration, letting LLM-powered agents branch, backtrack, and search across many execution paths, demands systems support well beyond today's pass-at-k resets. |
| Patterns behind Chaos: Forecasting Data Movement for Efficient Large-Scale MoE LLM Inference | Zhongkai Yu, Yue Guan, Zihao Yu, Chenyang Zhou, Zhengding Hu, Shuyi Pei, Yangwook Kang, Yufei Ding, Po-An Tsai | 2025-10-07 | Download | Large-scale Mixture of Experts (MoE) Large Language Models (LLMs) have recently become the frontier open weight models, achieving remarkable model capability similar to proprietary ones. |
| cMPI: Using CXL Memory Sharing for MPI One-Sided and Two-Sided Inter-Node Communications | Xi Wang, Bin Ma, Jongryool Kim, Byungil Koh, Hoshik Kim, Dong Li | 2025-10-07 | Download | Message Passing Interface (MPI) is a foundational programming model for high-performance computing. MPI libraries traditionally employ network interconnects (e.g. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Dynamic Scheduling in Fiber and Spaceborne Quantum Repeater Networks | Paolo Fittipaldi | 2025-10-07 | Download | The problem of scheduling in quantum networks amounts to choosing which entanglement swapping operations to perform to better serve user demand. |
| Leveraging Generative AI for large-scale prediction-based networking | Mathias Thorsager, Israel Leyva-Mayorga, Petar Popovski | 2025-10-07 | Download | The traditional role of the network layer is to create an end-to-end route, through which the intermediate nodes replicate and forward the packets towards the destination. |
| A Deep Q-Network based power control mechanism to Minimize RLF driven Handover Failure in 5G Network | Kotha Kartheek, Shankar K. Ghosh, Megha Iyengar, Vinod Sharma, Souvik Deb | 2025-10-07 | Download | The impact of Radio link failure (RLF) has been largely ignored in designing handover algorithms, although RLF is a major contributor towards causing handover failure (HF). |
| On Enhancing Delay SLAs in TCP Networks through Joint Routing and Transport Assistant Deployment | José Gómez-delaHiz, Mohamed Faten Zhani, Jaime Galán-Jiménez, John Kaippallimalil | 2025-10-07 | Download | The Transport Control Protocol has long been the primary transport protocol for applications requiring performance and reliability over the Internet. |
| Generative AI-Driven Hierarchical Multi-Agent Framework for Zero-Touch Optical Networks | Yao Zhang, Yuchen Song, Shengnan Li, Yan Shi, Shikui Shen, Xiongyan Tang, Min Zhang, Danshi Wang | 2025-10-07 | Download | The rapid development of Generative Artificial Intelligence (GenAI) has catalyzed a transformative technological revolution across all walks of life. |
| XAI-on-RAN: Explainable, AI-native, and GPU-Accelerated RAN Towards 6G | Osman Tugay Basaran, Falko Dressler | 2025-10-07 | Download | Artificial intelligence (AI)-native radio access networks (RANs) will serve vertical industries with stringent requirements: smart grids, autonomous vehicles, remote healthcare, industrial automation,... |
| cMPI: Using CXL Memory Sharing for MPI One-Sided and Two-Sided Inter-Node Communications | Xi Wang, Bin Ma, Jongryool Kim, Byungil Koh, Hoshik Kim, Dong Li | 2025-10-07 | Download | Message Passing Interface (MPI) is a foundational programming model for high-performance computing. MPI libraries traditionally employ network interconnects (e.g. |

cs.OS - Operating Systems

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Toward Systems Foundations for Agentic Exploration | Jiakai Xu, Tianle Zhou, Eugene Wu, Kostis Kaffes | 2025-10-07 | Download | Agentic exploration, letting LLM-powered agents branch, backtrack, and search across many execution paths, demands systems support well beyond today's pass-at-k resets. |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Adaptive Protein Design Protocols and Middleware | Aymen Alsaadi, Jonathan Ash, Mikhail Titov, Matteo Turilli, Andre Merzky, Shantenu Jha, Sagar Khare | 2025-10-07 | Download | Computational protein design is experiencing a transformation driven by AI/ML. However, the range of potential protein sequences and structures is astronomically vast, even for moderately sized protei... |
| lm-Meter: Unveiling Runtime Inference Latency for On-Device Language Models | Haoxin Wang, Xiaolong Tu, Hongyu Ke, Huirong Chai, Dawei Chen, Kyungtae Han | 2025-10-07 | Download | Large Language Models (LLMs) are increasingly integrated into everyday applications, but their prevalent cloud-based deployment raises growing concerns around data privacy and long-term sustainability... |
| Speeding up SQL subqueries via decoupling of non-correlated predicate (extended version) | Dmitrii Radivonchik, Yakov Kuzin, Anton Chizhov, Dmitriy Shcheka, Mikhail Firsov, Kirill Smirnov, George Chernishev | 2025-10-07 | Download | In this paper, we discuss a novel technique for processing correlated subqueries in SQL. The core idea is to isolate the non-correlated part of the predicate and use it to reduce the number of evaluat... |