2025-07-07

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Bit-Flip Fault Attack: Crushing Graph Neural Networks via Gradual Bit Search	Sanaz Kazemi Abharian, Sai Manoj Pudukotai Dinakarrao	2025-07-07	Download	Graph Neural Networks (GNNs) have emerged as a powerful machine learning method for graph-structured data. A plethora of hardware accelerators has been introduced to meet the performance demands of GN...
Accelerating GenAI Workloads by Enabling RISC-V Microkernel Support in IREE	Adeel Ahmad, Ahmad Tameem Kamal, Nouman Amir, Bilal Zafar, Saad Bin Nasir	2025-07-07	Download	This project enables RISC-V microkernel support in IREE, an MLIR-based machine learning compiler and runtime. The approach begins by enabling the lowering of MLIR linalg dialect contraction ops to lin...
ViPSN 2.0: A Reconfigurable Battery-free IoT Platform for Vibration Energy Harvesting	Xin Li, Mianxin Xiao, Xi Shen, Jiaqing Chu, Weifeng Huang, Jiashun Li, Yaoyi Li, Mingjing Cai, Jiaming Chen, Xinming Zhang, Daxing Zhang, Congsi Wang, Hong Tang, Bao Zhao, Qitao Lu, Yilong Wang, Jianjun Wang, Minyi Xu, Shitong Fang, Xuanyu Huang, Chaoyang Zhao, Zicheng Liu, Yaowen Yang, Guobiao Hu, Junrui Liang, Wei-Hsin Liao	2025-07-07	Download	Vibration energy harvesting is a promising solution for powering battery-free IoT systems; however, the instability of ambient vibrations presents significant challenges, such as limited harvested ene...
Optimizing Scalable Multi-Cluster Architectures for Next-Generation Wireless Sensing and Communication	Samuel Riedel, Yichao Zhang, Marco Bertuletti, Luca Benini	2025-07-07	Download	Next-generation wireless technologies (for immersive-massive communication, joint communication and sensing) demand highly parallel architectures for massive data processing.
Jack Unit: An Area- and Energy-Efficient Multiply-Accumulate (MAC) Unit Supporting Diverse Data Formats	Seock-Hwan Noh, Sungju Kim, Seohyun Kim, Daehoon Kim, Jaeha Kung, Yeseong Kim	2025-07-07	Download	In this work, we introduce an area- and energy-efficient multiply-accumulate (MAC) unit, named Jack unit, that is a jack-of-all-trades, supporting various data formats such as integer (INT), floating ...
ChipSeek-R1: Generating Human-Surpassing RTL with LLM via Hierarchical Reward-Driven Reinforcement Learning	Zhirong Chen, Kaiyan Chang, Zhuolin Li, Xinyang He, Chujie Chen, Cangyuan Li, Mengdi Wang, Haobo Xu, Yinhe Han, Ying Wang	2025-07-07	Download	Large Language Models (LLMs) show significant potential for automating Register-Transfer Level (RTL) code generation. However, current approaches face a critical challenge: they cannot simultaneously...
NeuroPDE: A Neuromorphic PDE Solver Based on Spintronic and Ferroelectric Devices	Siqing Fu, Lizhou Wu, Tiejun Li, Chunyuan Zhang, Sheng Ma, Jianmin Zhang, Yuhan Tang, Jixuan Tang	2025-07-07	Download	In recent years, new methods for solving partial differential equations (PDEs) such as Monte Carlo random walk methods have gained considerable attention.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Helix Parallelism: Rethinking Sharding Strategies for Interactive Multi-Million-Token LLM Decoding	Nidhi Bhatia, Ankit More, Ritika Borkar, Tiyasa Mitra, Ramon Matas, Ritchie Zhao, Maximilian Golub, Dheevatsa Mudigere, Brian Pharris, Bita Darvish Rouhani	2025-07-07	Download	As LLMs scale to multi-million-token KV histories, real-time autoregressive decoding under tight Token-to-Token Latency (TTL) constraints faces growing pressure.
Cooperative Gradient Coding	Shudi Weng, Ming Xiao, Chao Ren, Mikael Skoglund	2025-07-07	Download	This work studies gradient coding (GC) in the context of distributed training problems with unreliable communication. We propose cooperative GC (CoGC), a novel gradient-sharing-based GC framework that...
MoLink: Distributed and Efficient Serving Framework for Large Models	Lewei Jin, Yongqi Chen, Kui Zhang, Yifan Zhuo, Yi Gao, Bowei Yang, Zhengong Cai, Wei Dong	2025-07-07	Download	Large language models represent a groundbreaking shift in generative AI. Yet, these advances come with a significant challenge: the high cost of model serving.
Silent Failures in Stateless Systems: Rethinking Anomaly Detection for Serverless Computing	Chanh Nguyen, Erik Elmroth, Monowar Bhuyan	2025-07-07	Download	Serverless computing has redefined cloud application deployment by abstracting infrastructure and enabling on-demand, event-driven execution, thereby enhancing developer agility and scalability.
Distributed Approximation Algorithms for Minimum Dominating Set in Locally Nice Graphs	Marthe Bonamy, Cyril Gavoille, Timothé Picavet, Alexandra Wesolek	2025-07-07	Download	We give a new, short proof that graphs embeddable in a given Euler genus-$g$ surface admit a simple $f(g)$-round $\alpha$-approximation distributed algorithm for Minimum Dominating Set (MDS), where the app...
Bullshark on Narwhal: Implementation-level Workflow Analysis of Round-based DAG Consensus in Theory and Practice	Yusei Tanaka	2025-07-07	Download	Round-based DAGs enable high-performance Byzantine fault-tolerant consensus, yet their technical advantages remain underutilized due to their short history.
BackFed: An Efficient & Standardized Benchmark Suite for Backdoor Attacks in Federated Learning	Thinh Dao, Dung Thuy Nguyen, Khoa D Doan, Kok-Seng Wong	2025-07-07	Download	Research on backdoor attacks in Federated Learning (FL) has accelerated in recent years, with new attacks and defenses continually proposed in an escalating arms race.
Spattack: Subgroup Poisoning Attacks on Federated Recommender Systems	Bo Yan, Yurong Hao, Dingqi Liu, Huabin Sun, Pengpeng Qiao, Wei Yang Bryan Lim, Yang Cao, Chuan Shi	2025-07-07	Download	Federated recommender systems (FedRec) have emerged as a promising approach to provide personalized recommendations while protecting user privacy.
High Order Collaboration-Oriented Federated Graph Neural Network for Accurate QoS Prediction	Zehuan Chen, Xiangwei Lai	2025-07-07	Download	Predicting Quality of Service (QoS) data is crucial for cloud service selection, where user privacy is a critical concern. Federated Graph Neural Networks (FGNNs) can perform QoS data prediction as well ...
Demystifying NCCL: An In-depth Analysis of GPU Communication Protocols and Algorithms	Zhiyi Hu, Siyuan Shen, Tommaso Bonato, Sylvain Jeaugey, Cedell Alexander, Eric Spada, James Dinan, Jeff Hammond, Torsten Hoefler	2025-07-07	Download	The NVIDIA Collective Communication Library (NCCL) is a critical software layer enabling high-performance collectives on large-scale GPU clusters.
Communication Round and Computation Efficient Exclusive Prefix-Sums Algorithms (for MPI_Exscan)	Jesper Larsson Träff	2025-07-07	Download	Parallel scan primitives compute element-wise inclusive or exclusive prefix sums of input vectors contributed by $p$ consecutively ranked processors under an associative, binary operator $\oplus$.
Performance Evaluation of General Purpose Large Language Models for Basic Linear Algebra Subprograms Code Generation	Daichi Mukunoki, Shun-ichiro Hayashi, Tetsuya Hoshino, Takahiro Katagiri	2025-07-07	Download	Generative AI technology based on Large Language Models (LLMs) has been developed and applied to assist with or automatically generate program code.
RAPTOR: Practical Numerical Profiling of Scientific Applications	Faveo Hoerold, Ivan R. Ivanov, Akash Dhruv, William S. Moses, Anshu Dubey, Mohamed Wahib, Jens Domke	2025-07-07	Download	The proliferation of low-precision units in modern high-performance architectures increasingly burdens domain scientists. Historically, the choice in HPC was easy: can we get away with 32 bit floating...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Age-Aware CSI Acquisition of a Finite-State Markovian Channel	Onur Ayan, Jiping Luo, Xueli An, Nikolaos Pappas	2025-07-07	Download	The Age of Information (AoI) has emerged as a critical metric for quantifying information freshness; however, its interplay with channel estimation in partially observable wireless systems remains und...
User Association in the Presence of Jamming in Wireless Networks Using the Whittle Index	Pramod N Chine, Suven Jagtiani, Mandar R Nalavade, Gaurav S Kasbekar	2025-07-07	Download	In wireless networks, algorithms for user association, i.e., the task of choosing the base station (BS) that every arriving user should join, significantly impact the network performance.
Large Language Models for Network Intrusion Detection Systems: Foundations, Implementations, and Future Directions	Shuo Yang, Xinran Zheng, Xinchen Zhang, Jinfeng Xu, Jinze Li, Donglin Xie, Weicai Long, Edith C. H. Ngai	2025-07-07	Download	Large Language Models (LLMs) have revolutionized various fields with their exceptional capabilities in understanding, processing, and generating human-like text.
Low-Latency Software Polar Encoders and Decoders for Short Blocklengths	Mathieu Leonardon, Mohammed El Houcine Ayoubi, Adrien Cassagne, Romain Tajan, Camille Leroux	2025-07-07	Download	This paper presents our low-latency Polar code encoders and decoders developed for the 2025 International Symposium on Topics in Coding (ISTC 2025) contest, which challenges participants to implement ...
Synergistic Localization and Sensing in MIMO-OFDM Systems via Mixed-Integer Bilevel Learning	Zelin Zhu, Kai Yang, Rui Zhang	2025-07-07	Download	Wireless localization and sensing technologies are essential in modern wireless networks, supporting applications in smart cities, the Internet of Things (IoT), and autonomous systems.
Multimodal LLM Integrated Semantic Communications for 6G Immersive Experiences	Yusong Zhang, Yuxuan Sun, Lei Guo, Wei Chen, Bo Ai, Deniz Gunduz	2025-07-07	Download	6G networks promise revolutionary immersive communication experiences including augmented reality (AR), virtual reality (VR), and holographic communications.
On-Demand Multimedia Delivery in 6G: An Optimal-Cost Steiner Tree Approach	Zien Wang, Xiucheng Wang, Nan Cheng, Wenchao Xu, Wei Quan, Ruijin Sun, Conghao Zhou	2025-07-07	Download	The exponential growth of multimedia data traffic in 6G networks poses unprecedented challenges for immersive communication, where ultra-high-definition, multi-quality streaming must be delivered on d...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Accuracy and Consumption analysis from a compressed model by CompactifAI from Multiverse Computing	Damien Fovet, Shashank Chamoli, Sarah Oury, Srishti Singhal	2025-07-07	Download	This study evaluates the performance of a compression method, called CompactifAI, developed by Multiverse Computing, applied to the large language model Llama 3.1 8B.
Hardware-efficient tractable probabilistic inference for TinyML Neurosymbolic AI applications	Jelin Leslin, Martin Trapp, Martin Andraud	2025-07-07	Download	Neurosymbolic AI (NSAI) has recently emerged to mitigate limitations associated with deep learning (DL) models, e.g., quantifying their uncertainty or reasoning with explicit rules.