2025-07-09

cs.AR - Architecture

| Title | Authors | Date | Abstract |
| --- | --- | --- | --- |
| Compute Can't Handle the Truth: Why Communication Tax Prioritizes Memory and Interconnects in Modern AI Infrastructure | Myoungsoo Jung | 2025-07-09 | Modern AI workloads such as large language models (LLMs) and retrieval-augmented generation (RAG) impose severe demands on memory, communication bandwidth, and resource flexibility. |
| Opto-ViT: Architecting a Near-Sensor Region of Interest-Aware Vision Transformer Accelerator with Silicon Photonics | Mehrdad Morsali, Chengwei Zhou, Deniz Najafi, Sreetama Sarkar, Pietro Mercati, Navid Khoshavi, Peter Beerel, Mahdi Nikdast, Gourav Datta, Shaahin Angizi | 2025-07-09 | Vision Transformers (ViTs) have emerged as a powerful architecture for computer vision tasks due to their ability to model long-range dependencies and global contextual relationships. |
| VerilogDB: The Largest, Highest-Quality Dataset with a Preprocessing Framework for LLM-based RTL Generation | Paul E. Calzada, Zahin Ibnat, Tanvir Rahman, Kamal Kandula, Danyu Lu, Sujan Kumar Saha, Farimah Farahmandi, Mark Tehranipoor | 2025-07-09 | Large Language Models (LLMs) are gaining popularity for hardware design automation, particularly through Register Transfer Level (RTL) code generation. |
| Deep-Learning-Based Pre-Layout Parasitic Capacitance Prediction on SRAM Designs | Shan Shen, Dingcheng Yang, Yuyang Xie, Chunyan Pei, Wenjian Yu, Bei Yu | 2025-07-09 | To achieve higher system energy efficiency, SRAM in SoCs is often customized. The parasitic effects cause notable discrepancies between pre-layout and post-layout circuit simulations, leading to diffi... |
| Towards LLM-based Root Cause Analysis of Hardware Design Failures | Siyu Qiu, Muzhi Wang, Raheel Afsharmazayejani, Mohammad Moradi Shahmiri, Benjamin Tan, Hammond Pearce | 2025-07-09 | With advances in large language models (LLMs), new opportunities have emerged to develop tools that support the digital hardware design process. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Date | Abstract |
| --- | --- | --- | --- |
| Compute Can't Handle the Truth: Why Communication Tax Prioritizes Memory and Interconnects in Modern AI Infrastructure | Myoungsoo Jung | 2025-07-09 | Modern AI workloads such as large language models (LLMs) and retrieval-augmented generation (RAG) impose severe demands on memory, communication bandwidth, and resource flexibility. |
| A Survey on Bilateral Multi-Round Cloud-SLA Negotiation Strategies | Benedikt Pittl, Werner Mach, Erich Schikuta | 2025-07-09 | Today, static cloud markets where consumers purchase services directly from providers are dominating. Thus, consumers neither negotiate the price nor the characteristics of the service. |
| Accelerated Spatio-Temporal Bayesian Modeling for Multivariate Gaussian Processes | Lisa Gaedke-Merzhäuser, Vincent Maillou, Fernando Rodriguez Avellaneda, Olaf Schenk, Mathieu Luisier, Paula Moraga, Alexandros Nikolaos Ziogas, Håvard Rue | 2025-07-09 | Multivariate Gaussian processes (GPs) offer a powerful probabilistic framework to represent complex interdependent phenomena. They pose, however, significant computational challenges in high-dimension... |
| DICE: Data Influence Cascade in Decentralized Learning | Tongtian Zhu, Wenhao Li, Can Wang, Fengxiang He | 2025-07-09 | Decentralized learning offers a promising approach to crowdsource data consumptions and computational workloads across geographically distributed compute interconnected through peer-to-peer networks, ... |
| Towards Efficient and Scalable Distributed Vector Search with RDMA | Xiangyu Zhi, Meng Chen, Xiao Yan, Baotong Lu, Hui Li, Qianxi Zhang, Qi Chen, James Cheng | 2025-07-09 | Similarity-based vector search facilitates many important applications such as search and recommendation but is limited by the memory capacity and bandwidth of a single machine due to large datasets a... |
| Nexus: Proactive Intra-GPU Disaggregation of Prefill and Decode in LLM Serving | Xiaoxiang Shi, Colin Cai, Junjia Du, Zhihao Jia | 2025-07-09 | Monolithic serving with chunked prefill improves GPU utilization by batching prefill and decode together, but suffers from fine-grained phase interference. |
| M²-MFP: A Multi-Scale and Multi-Level Memory Failure Prediction Framework for Reliable Cloud Infrastructure | Hongyi Xie, Min Zhou, Qiao Yu, Jialiang Yu, Zhenli Sheng, Hong Xie, Defu Lian | 2025-07-09 | As cloud services become increasingly integral to modern IT infrastructure, ensuring hardware reliability is essential to sustain high-quality service. |
| SlimCaching: Edge Caching of Mixture-of-Experts for Distributed Inference | Qian Chen, Xianhao Chen, Kaibin Huang | 2025-07-09 | Mixture-of-Experts (MoE) models improve the scalability of large language models (LLMs) by activating only a small subset of relevant experts per input. |
| On the Surprising Effectiveness of a Single Global Merging in Decentralized Learning | Tongtian Zhu, Tianyu Zhang, Mingze Wang, Zhanpeng Zhou, Can Wang | 2025-07-09 | Decentralized learning provides a scalable alternative to parameter-server-based training, yet its performance is often hindered by limited peer-to-peer communication. |
| The AI Shadow War: SaaS vs. Edge Computing Architectures | Rhea Pritham Marpu, Kevin J McNamara, Preeti Gupta | 2025-07-09 | The very DNA of AI architecture presents conflicting paths: centralized cloud-based models (Software-as-a-Service) versus decentralized edge AI (local processing on consumer devices). |
| Designing Parallel Algorithms for Community Detection using Arachne | Fuhuan Li, Zhihui Du, David A. Bader | 2025-07-09 | The rise of graph data in various fields calls for efficient and scalable community detection algorithms. In this paper, we present parallel implementations of two widely used algorithms: Label Propag... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Date | Abstract |
| --- | --- | --- | --- |
| Federated Learning-based MARL for Strengthening Physical-Layer Security in B5G Networks | Deemah H. Tashman, Soumaya Cherkaoui, Walaa Hamouda | 2025-07-09 | This paper explores the application of a federated learning-based multi-agent reinforcement learning (MARL) strategy to enhance physical-layer security (PLS) in a multi-cellular network within the con... |
| Maximizing Reliability in Overlay Radio Networks with Time Switching and Power Splitting Energy Harvesting | Deemah H. Tashman, Soumaya Cherkaoui, Walaa Hamouda | 2025-07-09 | Cognitive radio networks (CRNs) are acknowledged for their ability to tackle the issue of spectrum under-utilization. In the realm of CRNs, this paper investigates the energy efficiency issue and addr... |
| Optimizing Cognitive Networks: Reinforcement Learning Meets Energy Harvesting Over Cascaded Channels | Deemah H. Tashman, Soumaya Cherkaoui, Walaa Hamouda | 2025-07-09 | This paper presents a reinforcement learning (RL) based approach to improve the physical layer security (PLS) of an underlay cognitive radio network (CRN) over cascaded channels. |
| Beyond Connectivity: An Open Architecture for AI-RAN Convergence in 6G | Michele Polese, Niloofar Mohamadi, Salvatore D'Oro, Leonardo Bonati, Tommaso Melodia | 2025-07-09 | Data-intensive Artificial Intelligence (AI) applications at the network edge demand a fundamental shift in Radio Access Network (RAN) design, from merely consuming AI for network optimization, to acti... |
| Connecting the Unconnected -- Sentiment Analysis of Field Survey of Internet Connectivity in Emerging Economies | Dibakar Das, Barath S Narayan, Aarna Bhammar, Jyotsna Bapat | 2025-07-09 | Internet has significantly improved the quality of citizens across the world. Though the internet coverage is quite high, 40% of global population do not have access to broadband internet. |
| DAF: An Efficient End-to-End Dynamic Activation Framework for on-Device DNN Training | Renyuan Liu, Yuyang Leng, Kaiyan Liu, Shaohan Hu, Chun-Fu Chen, Peijun Zhao, Heechul Yun, Shuochao Yao | 2025-07-09 | Recent advancements in on-device training for deep neural networks have underscored the critical need for efficient activation compression to overcome the memory constraints of mobile and edge devices... |
| Stacked Intelligent Metasurfaces-Aided eVTOL Delay Sensitive Communications | Liyuan Chen, Kai Xiong, Yujie Qin, Hanqing Yu, Supeng Leng, Chau Yuen | 2025-07-09 | With rapid urbanization and increasing population density, urban traffic congestion has become a critical issue, and traditional ground transportation methods are no longer sufficient to address it ef... |
| SlimCaching: Edge Caching of Mixture-of-Experts for Distributed Inference | Qian Chen, Xianhao Chen, Kaibin Huang | 2025-07-09 | Mixture-of-Experts (MoE) models improve the scalability of large language models (LLMs) by activating only a small subset of relevant experts per input. |
| Learning To Communicate Over An Unknown Shared Network | Shivangi Agarwal, Adi Asija, Sanjit K. Kaul, Arani Bhattacharya, Saket Anand | 2025-07-09 | As robots (edge-devices, agents) find uses in an increasing number of settings and edge-cloud resources become pervasive, wireless networks will often be shared by flows of data traffic that result fr... |

cs.PF - Performance

| Title | Authors | Date | Abstract |
| --- | --- | --- | --- |
| Uncertainty Quantification as a Complementary Latent Health Indicator for Remaining Useful Life Prediction on Turbofan Engines | Lucas Thil, Jesse Read, Rim Kaddah, Guillaume Florent Doquet | 2025-07-09 | Health Indicators (HIs) are essential for predicting system failures in predictive maintenance. While methods like RaPP (Reconstruction along Projected Pathways) improve traditional HI approaches by l... |