2025-07-03
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Hey AI, Generate Me a Hardware Code! Agentic AI-based Hardware Design & Verification | Deepak Narayan Gadde, Keerthan Kopparam Radhakrishna, Vaisakh Naduvodi Viswambharan, Aman Kumar, Djones Lettnin, Wolfgang Kunz, Sebastian Simon | 2025-07-03 | Download | Modern Integrated Circuits (ICs) are becoming increasingly complex, and so is their development process. Hardware design verification entails a methodical and disciplined approach to the planning, dev... |
| Breaking the HBM Bit Cost Barrier: Domain-Specific ECC for AI Inference Infrastructure | Rui Xie, Asad Ul Haq, Yunhua Fang, Linsen Ma, Sanchari Sen, Swagath Venkataramani, Liu Liu, Tong Zhang | 2025-07-03 | Download | High-Bandwidth Memory (HBM) delivers exceptional bandwidth and energy efficiency for AI workloads, but its high cost per bit, driven in part by stringent on-die reliability requirements, poses a growi... |
| AC-Refiner: Efficient Arithmetic Circuit Optimization Using Conditional Diffusion Models | Chenhao Xue, Kezhi Li, Jiaxing Zhang, Yi Ren, Zhengyuan Shi, Chen Zhang, Yibo Lin, Lining Zhang, Qiang Xu, Guangyu Sun | 2025-07-03 | Download | Arithmetic circuits, such as adders and multipliers, are fundamental components of digital systems, directly impacting the performance, power efficiency, and area footprint. |
| System-performance and cost modeling of Large Language Model training and inference | Wenzhe Guo, Joyjit Kundu, Uras Tos, Weijiang Kong, Giuliano Sisto, Timon Evenblij, Manu Perumkunnil | 2025-07-03 | Download | Large language models (LLMs), based on transformer architectures, have revolutionized numerous domains within artificial intelligence, science, and engineering due to their exceptional scalability and... |
| DecoRTL: A Run-time Decoding Framework for RTL Code Generation with LLMs | Mohammad Akyash, Kimia Azar, Hadi Kamali | 2025-07-03 | Download | As one of their many applications, large language models (LLMs) have recently shown promise in automating register transfer level (RTL) code generation. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Symbiosis: Multi-Adapter Inference and Fine-Tuning | Saransh Gupta, Umesh Deshpande, Travis Janssen, Swami Sundararaman | 2025-07-03 | Download | Parameter-efficient fine-tuning (PEFT) allows model builders to capture the task-specific parameters into adapters, which are a fraction of the size of the original base model. |
| Collective Communication Profiling of Modern-day Machine Learning Workloads | Jit Gupta, Andrew Li, Tarun Banka, Ariel Cohen, T. Sridhar, Raj Yavatkar | 2025-07-03 | Download | Machine Learning jobs, carried out on a large number of distributed high performance systems, involve periodic communication using operations like AllReduce, AllGather, and Broadcast. |
| BLaST: High Performance Inference and Pretraining using BLock Sparse Transformers | Patrik Okanovic, Sameer Deshmukh, Grzegorz Kwasniewski, Yi Zhu, Haruto Fujii, Sakina Fatima, Maciej Besta, Kentaro Katayama, Takumi Honda, Yusuke Nagasaka, Torsten Hoefler | 2025-07-03 | Download | The energy consumption of large-scale ML models is dominated by data movement, shuffling billions of parameters across memory hierarchies and data centers. |
| Characterizing Compute-Communication Overlap in GPU-Accelerated Distributed Deep Learning: Performance and Power Implications | Seonho Lee, Jihwan Oh, Junkyum Kim, Seokjin Go, Jongse Park, Divya Mahajan | 2025-07-03 | Download | This paper provides an in-depth characterization of GPU-accelerated systems, to understand the interplay between overlapping computation and communication which is commonly employed in distributed tra... |
| FlowSpec: Continuous Pipelined Speculative Decoding for Efficient Distributed LLM Inference | Xing Liu, Lizhuo Luo, Ming Tang, Chao Huang, Xu Chen | 2025-07-03 | Download | Distributed inference serves as a promising approach to enabling the inference of large language models (LLMs) at the network edge. It distributes the inference process to multiple devices to ensure t... |
| MULTI-SCOUT: Multistatic Integrated Sensing and Communications in 5G and Beyond for Moving Target Detection, Positioning, and Tracking | Yalin E. Sagduyu, Kemal Davaslioglu, Tugba Erpek, Sastry Kompella, Gustave Anderson, Jonathan Ashdown | 2025-07-03 | Download | This paper presents a complete signal-processing chain for multistatic integrated sensing and communications (ISAC) using 5G Positioning Reference Signal (PRS). |
| Analysing semantic data storage in Distributed Ledger Technologies for Data Spaces | Juan Cano-Benito, Andrea Cimmino, Sven Hertling, Heiko Paulheim, Raúl García-Castro | 2025-07-03 | Download | Data spaces are emerging as decentralised infrastructures that enable sovereign, secure, and trustworthy data exchange among multiple participants. |
| Resolving CAP Through Automata-Theoretic Economic Design: A Unified Mathematical Framework for Real-Time Partition-Tolerant Systems | Craig S Wright | 2025-07-03 | Download | The CAP theorem asserts a trilemma between consistency, availability, and partition tolerance. This paper introduces a rigorous automata-theoretic and economically grounded framework that reframes the... |
| Red grape detection with accelerated artificial neural networks in the FPGA's programmable logic | Sandro Costa Magalhães, Marco Almeida, Filipe Neves dos Santos, António Paulo Moreira, Jorge Dias | 2025-07-03 | Download | Robots usually slow down for scanning to detect objects while moving. Additionally, the robot's camera is configured with a low framerate to track the velocity of the detection algorithms. |
| Alps, a versatile research infrastructure | Maxime Martinasso, Mark Klein, Thomas C. Schulthess | 2025-07-03 | Download | The Swiss National Supercomputing Centre (CSCS) has a long-standing tradition of delivering top-tier high-performance computing systems, exemplified by the Piz Daint supercomputer. |
| On the Inference (In-)Security of Vertical Federated Learning: Efficient Auditing against Inference Tampering Attack | Chung-ju Huang, Ziqi Zhang, Yinggui Wang, Binghui Wang, Tao Wei, Leye Wang | 2025-07-03 | Download | Vertical Federated Learning (VFL) is an emerging distributed learning paradigm for cross-silo collaboration without accessing participants' data. |
| Flotilla: A scalable, modular and resilient federated learning framework for heterogeneous resources | Roopkatha Banerjee, Prince Modi, Jinal Vyas, Chunduru Sri Abhijit, Tejus Chandrashekar, Harsha Varun Marisetty, Manik Gupta, Yogesh Simmhan | 2025-07-03 | Download | With the recent improvements in mobile and edge computing and rising concerns of data privacy, Federated Learning (FL) has rapidly gained popularity as a privacy-preserving, distributed machine learnin... |
| Domain-Adversarial Transfer Learning for Fault Root Cause Identification in Cloud Computing Systems | Bruce Fang, Danyi Gao | 2025-07-03 | Download | This paper addresses the challenge of fault root cause identification in cloud computing environments. The difficulty arises from complex system structures, dense service coupling, and limited fault i... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| RCA Copilot: Transforming Network Data into Actionable Insights via Large Language Models | Alexander Shan, Jasleen Kaur, Rahul Singh, Tarun Banka, Raj Yavatkar, T. Sridhar | 2025-07-03 | Download | Ensuring the reliability and availability of complex networked services demands effective root cause analysis (RCA) across cloud environments, data centers, and on-premises networks. |
| Collective Communication Profiling of Modern-day Machine Learning Workloads | Jit Gupta, Andrew Li, Tarun Banka, Ariel Cohen, T. Sridhar, Raj Yavatkar | 2025-07-03 | Download | Machine Learning jobs, carried out on a large number of distributed high performance systems, involve periodic communication using operations like AllReduce, AllGather, and Broadcast. |
| An End-to-End Assurance Framework for AI/ML Workloads in Datacenters | Jit Gupta, Tarun Banka, Rahul Gupta, Mithun Dharmaraj, Jasleen Kaur | 2025-07-03 | Download | Modern machine learning workloads, such as large language model training and fine-tuning jobs, are highly distributed and span across hundreds of systems with multiple GPUs. |
| DNN-Based Precoding in RIS-Aided mmWave MIMO Systems With Practical Phase Shift | Po-Heng Chou, Ching-Wen Chen, Wan-Jen Huang, Walid Saad, Yu Tsao, Ronald Y. Chang | 2025-07-03 | Download | In this paper, the precoding design is investigated for maximizing the throughput of millimeter wave (mmWave) multiple-input multiple-output (MIMO) systems with obstructed direct communication paths. |
| On the Architectural Split and Radio Intelligence Controller Placement in Integrated O-RAN-enabled Non-Terrestrial Networks | Jorge Baranda, Marius Caus, Luis Blanco, Cristian J. Vaca-Rubio, Engin Zeydan, Kapal Dev, Zheng Li, Tomaso DeCola | 2025-07-03 | Download | The integration of Terrestrial Networks (TNs) with Non-Terrestrial Networks (NTNs) poses unique architectural and functional challenges due to heterogeneous propagation conditions, dynamic topologies ... |
| MULTI-SCOUT: Multistatic Integrated Sensing and Communications in 5G and Beyond for Moving Target Detection, Positioning, and Tracking | Yalin E. Sagduyu, Kemal Davaslioglu, Tugba Erpek, Sastry Kompella, Gustave Anderson, Jonathan Ashdown | 2025-07-03 | Download | This paper presents a complete signal-processing chain for multistatic integrated sensing and communications (ISAC) using 5G Positioning Reference Signal (PRS). |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Access Control Threatened by Quantum Entanglement | Zhicheng Zhang, Mingsheng Ying | 2025-07-03 | Download | Access control is a cornerstone of computer security that prevents unauthorised access to resources. In this paper, we study access control in quantum computer systems. |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| DistZO2: High-Throughput and Memory-Efficient Zeroth-Order Fine-tuning LLMs with Distributed Parallel Computing | Liangyu Wang, Huanyi Xie, Di Wang | 2025-07-03 | Download | Fine-tuning large language models (LLMs) remains resource-intensive due to their sheer scale. While zeroth-order (ZO) optimization provides a memory-efficient alternative by eliminating backward passe... |