2025-07-09

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Compute Can't Handle the Truth: Why Communication Tax Prioritizes Memory and Interconnects in Modern AI Infrastructure	Myoungsoo Jung	2025-07-09	Download	Modern AI workloads such as large language models (LLMs) and retrieval-augmented generation (RAG) impose severe demands on memory, communication bandwidth, and resource flexibility.
Opto-ViT: Architecting a Near-Sensor Region of Interest-Aware Vision Transformer Accelerator with Silicon Photonics	Mehrdad Morsali, Chengwei Zhou, Deniz Najafi, Sreetama Sarkar, Pietro Mercati, Navid Khoshavi, Peter Beerel, Mahdi Nikdast, Gourav Datta, Shaahin Angizi	2025-07-09	Download	Vision Transformers (ViTs) have emerged as a powerful architecture for computer vision tasks due to their ability to model long-range dependencies and global contextual relationships.
VerilogDB: The Largest, Highest-Quality Dataset with a Preprocessing Framework for LLM-based RTL Generation	Paul E. Calzada, Zahin Ibnat, Tanvir Rahman, Kamal Kandula, Danyu Lu, Sujan Kumar Saha, Farimah Farahmandi, Mark Tehranipoor	2025-07-09	Download	Large Language Models (LLMs) are gaining popularity for hardware design automation, particularly through Register Transfer Level (RTL) code generation.
Deep-Learning-Based Pre-Layout Parasitic Capacitance Prediction on SRAM Designs	Shan Shen, Dingcheng Yang, Yuyang Xie, Chunyan Pei, Wenjian Yu, Bei Yu	2025-07-09	Download	To achieve higher system energy efficiency, SRAM in SoCs is often customized. The parasitic effects cause notable discrepancies between pre-layout and post-layout circuit simulations, leading to diffi...
Towards LLM-based Root Cause Analysis of Hardware Design Failures	Siyu Qiu, Muzhi Wang, Raheel Afsharmazayejani, Mohammad Moradi Shahmiri, Benjamin Tan, Hammond Pearce	2025-07-09	Download	With advances in large language models (LLMs), new opportunities have emerged to develop tools that support the digital hardware design process.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Compute Can't Handle the Truth: Why Communication Tax Prioritizes Memory and Interconnects in Modern AI Infrastructure	Myoungsoo Jung	2025-07-09	Download	Modern AI workloads such as large language models (LLMs) and retrieval-augmented generation (RAG) impose severe demands on memory, communication bandwidth, and resource flexibility.
A Survey on Bilateral Multi-Round Cloud-SLA Negotiation Strategies	Benedikt Pittl, Werner Mach, Erich Schikuta	2025-07-09	Download	Today, static cloud markets where consumers purchase services directly from providers are dominating. Thus, consumers neither negotiate the price nor the characteristics of the service.
Accelerated Spatio-Temporal Bayesian Modeling for Multivariate Gaussian Processes	Lisa Gaedke-Merzhäuser, Vincent Maillou, Fernando Rodriguez Avellaneda, Olaf Schenk, Mathieu Luisier, Paula Moraga, Alexandros Nikolaos Ziogas, Håvard Rue	2025-07-09	Download	Multivariate Gaussian processes (GPs) offer a powerful probabilistic framework to represent complex interdependent phenomena. They pose, however, significant computational challenges in high-dimension...
DICE: Data Influence Cascade in Decentralized Learning	Tongtian Zhu, Wenhao Li, Can Wang, Fengxiang He	2025-07-09	Download	Decentralized learning offers a promising approach to crowdsource data consumptions and computational workloads across geographically distributed compute interconnected through peer-to-peer networks, ...
Towards Efficient and Scalable Distributed Vector Search with RDMA	Xiangyu Zhi, Meng Chen, Xiao Yan, Baotong Lu, Hui Li, Qianxi Zhang, Qi Chen, James Cheng	2025-07-09	Download	Similarity-based vector search facilitates many important applications such as search and recommendation but is limited by the memory capacity and bandwidth of a single machine due to large datasets a...
Nexus: Proactive Intra-GPU Disaggregation of Prefill and Decode in LLM Serving	Xiaoxiang Shi, Colin Cai, Junjia Du, Zhihao Jia	2025-07-09	Download	Monolithic serving with chunked prefill improves GPU utilization by batching prefill and decode together, but suffers from fine-grained phase interference.
M$^2$-MFP: A Multi-Scale and Multi-Level Memory Failure Prediction Framework for Reliable Cloud Infrastructure	Hongyi Xie, Min Zhou, Qiao Yu, Jialiang Yu, Zhenli Sheng, Hong Xie, Defu Lian	2025-07-09	Download	As cloud services become increasingly integral to modern IT infrastructure, ensuring hardware reliability is essential to sustain high-quality service.
SlimCaching: Edge Caching of Mixture-of-Experts for Distributed Inference	Qian Chen, Xianhao Chen, Kaibin Huang	2025-07-09	Download	Mixture-of-Experts (MoE) models improve the scalability of large language models (LLMs) by activating only a small subset of relevant experts per input.
On the Surprising Effectiveness of a Single Global Merging in Decentralized Learning	Tongtian Zhu, Tianyu Zhang, Mingze Wang, Zhanpeng Zhou, Can Wang	2025-07-09	Download	Decentralized learning provides a scalable alternative to parameter-server-based training, yet its performance is often hindered by limited peer-to-peer communication.
The AI Shadow War: SaaS vs. Edge Computing Architectures	Rhea Pritham Marpu, Kevin J McNamara, Preeti Gupta	2025-07-09	Download	The very DNA of AI architecture presents conflicting paths: centralized cloud-based models (Software-as-a-Service) versus decentralized edge AI (local processing on consumer devices).
Designing Parallel Algorithms for Community Detection using Arachne	Fuhuan Li, Zhihui Du, David A. Bader	2025-07-09	Download	The rise of graph data in various fields calls for efficient and scalable community detection algorithms. In this paper, we present parallel implementations of two widely used algorithms: Label Propag...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Federated Learning-based MARL for Strengthening Physical-Layer Security in B5G Networks	Deemah H. Tashman, Soumaya Cherkaoui, Walaa Hamouda	2025-07-09	Download	This paper explores the application of a federated learning-based multi-agent reinforcement learning (MARL) strategy to enhance physical-layer security (PLS) in a multi-cellular network within the con...
Maximizing Reliability in Overlay Radio Networks with Time Switching and Power Splitting Energy Harvesting	Deemah H. Tashman, Soumaya Cherkaoui, Walaa Hamouda	2025-07-09	Download	Cognitive radio networks (CRNs) are acknowledged for their ability to tackle the issue of spectrum under-utilization. In the realm of CRNs, this paper investigates the energy efficiency issue and addr...
Optimizing Cognitive Networks: Reinforcement Learning Meets Energy Harvesting Over Cascaded Channels	Deemah H. Tashman, Soumaya Cherkaoui, Walaa Hamouda	2025-07-09	Download	This paper presents a reinforcement learning (RL) based approach to improve the physical layer security (PLS) of an underlay cognitive radio network (CRN) over cascaded channels.
Beyond Connectivity: An Open Architecture for AI-RAN Convergence in 6G	Michele Polese, Niloofar Mohamadi, Salvatore D'Oro, Leonardo Bonati, Tommaso Melodia	2025-07-09	Download	Data-intensive Artificial Intelligence (AI) applications at the network edge demand a fundamental shift in Radio Access Network (RAN) design, from merely consuming AI for network optimization, to acti...
Connecting the Unconnected -- Sentiment Analysis of Field Survey of Internet Connectivity in Emerging Economies	Dibakar Das, Barath S Narayan, Aarna Bhammar, Jyotsna Bapat	2025-07-09	Download	Internet has significantly improved the quality of citizens across the world. Though the internet coverage is quite high, 40% of global population do not have access to broadband internet.
DAF: An Efficient End-to-End Dynamic Activation Framework for on-Device DNN Training	Renyuan Liu, Yuyang Leng, Kaiyan Liu, Shaohan Hu, Chun-Fu Chen, Peijun Zhao, Heechul Yun, Shuochao Yao	2025-07-09	Download	Recent advancements in on-device training for deep neural networks have underscored the critical need for efficient activation compression to overcome the memory constraints of mobile and edge devices...
Stacked Intelligent Metasurfaces-Aided eVTOL Delay Sensitive Communications	Liyuan Chen, Kai Xiong, Yujie Qin, Hanqing Yu, Supeng Leng, Chau Yuen	2025-07-09	Download	With rapid urbanization and increasing population density, urban traffic congestion has become a critical issue, and traditional ground transportation methods are no longer sufficient to address it ef...
SlimCaching: Edge Caching of Mixture-of-Experts for Distributed Inference	Qian Chen, Xianhao Chen, Kaibin Huang	2025-07-09	Download	Mixture-of-Experts (MoE) models improve the scalability of large language models (LLMs) by activating only a small subset of relevant experts per input.
Learning To Communicate Over An Unknown Shared Network	Shivangi Agarwal, Adi Asija, Sanjit K. Kaul, Arani Bhattacharya, Saket Anand	2025-07-09	Download	As robots (edge-devices, agents) find uses in an increasing number of settings and edge-cloud resources become pervasive, wireless networks will often be shared by flows of data traffic that result fr...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Uncertainty Quantification as a Complementary Latent Health Indicator for Remaining Useful Life Prediction on Turbofan Engines	Lucas Thil, Jesse Read, Rim Kaddah, Guillaume Florent Doquet	2025-07-09	Download	Health Indicators (HIs) are essential for predicting system failures in predictive maintenance. While methods like RaPP (Reconstruction along Projected Pathways) improve traditional HI approaches by l...