
2025-07-07

cs.AR - Architecture

| Title | Authors | Date | Abstract |
| --- | --- | --- | --- |
| Bit-Flip Fault Attack: Crushing Graph Neural Networks via Gradual Bit Search | Sanaz Kazemi Abharian, Sai Manoj Pudukotai Dinakarrao | 2025-07-07 | Graph Neural Networks (GNNs) have emerged as a powerful machine learning method for graph-structured data. A plethora of hardware accelerators has been introduced to meet the performance demands of GN... |
| Accelerating GenAI Workloads by Enabling RISC-V Microkernel Support in IREE | Adeel Ahmad, Ahmad Tameem Kamal, Nouman Amir, Bilal Zafar, Saad Bin Nasir | 2025-07-07 | This project enables RISC-V microkernel support in IREE, an MLIR-based machine learning compiler and runtime. The approach begins by enabling the lowering of MLIR linalg dialect contraction ops to lin... |
| ViPSN 2.0: A Reconfigurable Battery-free IoT Platform for Vibration Energy Harvesting | Xin Li, Mianxin Xiao, Xi Shen, Jiaqing Chu, Weifeng Huang, Jiashun Li, Yaoyi Li, Mingjing Cai, Jiaming Chen, Xinming Zhang, Daxing Zhang, Congsi Wang, Hong Tang, Bao Zhao, Qitao Lu, Yilong Wang, Jianjun Wang, Minyi Xu, Shitong Fang, Xuanyu Huang, Chaoyang Zhao, Zicheng Liu, Yaowen Yang, Guobiao Hu, Junrui Liang, Wei-Hsin Liao | 2025-07-07 | Vibration energy harvesting is a promising solution for powering battery-free IoT systems; however, the instability of ambient vibrations presents significant challenges, such as limited harvested ene... |
| Optimizing Scalable Multi-Cluster Architectures for Next-Generation Wireless Sensing and Communication | Samuel Riedel, Yichao Zhang, Marco Bertuletti, Luca Benini | 2025-07-07 | Next-generation wireless technologies (for immersive-massive communication, joint communication and sensing) demand highly parallel architectures for massive data processing. |
| Jack Unit: An Area- and Energy-Efficient Multiply-Accumulate (MAC) Unit Supporting Diverse Data Formats | Seock-Hwan Noh, Sungju Kim, Seohyun Kim, Daehoon Kim, Jaeha Kung, Yeseong Kim | 2025-07-07 | In this work, we introduce an area- and energy-efficient multiply-accumulate (MAC) unit, named Jack unit, that is a jack-of-all-trades, supporting various data formats such as integer (INT), floating ... |
| ChipSeek-R1: Generating Human-Surpassing RTL with LLM via Hierarchical Reward-Driven Reinforcement Learning | Zhirong Chen, Kaiyan Chang, Zhuolin Li, Xinyang He, Chujie Chen, Cangyuan Li, Mengdi Wang, Haobo Xu, Yinhe Han, Ying Wang | 2025-07-07 | Large Language Models (LLMs) show significant potential for automating Register-Transfer Level (RTL) code generation. However, current approaches face a critical challenge: they cannot simultaneously... |
| NeuroPDE: A Neuromorphic PDE Solver Based on Spintronic and Ferroelectric Devices | Siqing Fu, Lizhou Wu, Tiejun Li, Chunyuan Zhang, Sheng Ma, Jianmin Zhang, Yuhan Tang, Jixuan Tang | 2025-07-07 | In recent years, new methods for solving partial differential equations (PDEs) such as Monte Carlo random walk methods have gained considerable attention. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Date | Abstract |
| --- | --- | --- | --- |
| Helix Parallelism: Rethinking Sharding Strategies for Interactive Multi-Million-Token LLM Decoding | Nidhi Bhatia, Ankit More, Ritika Borkar, Tiyasa Mitra, Ramon Matas, Ritchie Zhao, Maximilian Golub, Dheevatsa Mudigere, Brian Pharris, Bita Darvish Rouhani | 2025-07-07 | As LLMs scale to multi-million-token KV histories, real-time autoregressive decoding under tight Token-to-Token Latency (TTL) constraints faces growing pressure. |
| Cooperative Gradient Coding | Shudi Weng, Ming Xiao, Chao Ren, Mikael Skoglund | 2025-07-07 | This work studies gradient coding (GC) in the context of distributed training problems with unreliable communication. We propose cooperative GC (CoGC), a novel gradient-sharing-based GC framework that... |
| MoLink: Distributed and Efficient Serving Framework for Large Models | Lewei Jin, Yongqi Chen, Kui Zhang, Yifan Zhuo, Yi Gao, Bowei Yang, Zhengong Cai, Wei Dong | 2025-07-07 | Large language models represent a groundbreaking shift in generative AI. Yet, these advances come with a significant challenge: the high cost of model serving. |
| Silent Failures in Stateless Systems: Rethinking Anomaly Detection for Serverless Computing | Chanh Nguyen, Erik Elmroth, Monowar Bhuyan | 2025-07-07 | Serverless computing has redefined cloud application deployment by abstracting infrastructure and enabling on-demand, event-driven execution, thereby enhancing developer agility and scalability. |
| Distributed Approximation Algorithms for Minimum Dominating Set in Locally Nice Graphs | Marthe Bonamy, Cyril Gavoille, Timothé Picavet, Alexandra Wesolek | 2025-07-07 | We give a new, short proof that graphs embeddable in a given Euler genus-g surface admit a simple f(g)-round α-approximation distributed algorithm for Minimum Dominating Set (MDS), where the app... |
| Bullshark on Narwhal: Implementation-level Workflow Analysis of Round-based DAG Consensus in Theory and Practice | Yusei Tanaka | 2025-07-07 | Round-based DAGs enable high-performance Byzantine fault-tolerant consensus, yet their technical advantages remain underutilized due to their short history. |
| BackFed: An Efficient & Standardized Benchmark Suite for Backdoor Attacks in Federated Learning | Thinh Dao, Dung Thuy Nguyen, Khoa D Doan, Kok-Seng Wong | 2025-07-07 | Research on backdoor attacks in Federated Learning (FL) has accelerated in recent years, with new attacks and defenses continually proposed in an escalating arms race. |
| Spattack: Subgroup Poisoning Attacks on Federated Recommender Systems | Bo Yan, Yurong Hao, Dingqi Liu, Huabin Sun, Pengpeng Qiao, Wei Yang Bryan Lim, Yang Cao, Chuan Shi | 2025-07-07 | Federated recommender systems (FedRec) have emerged as a promising approach to provide personalized recommendations while protecting user privacy. |
| High Order Collaboration-Oriented Federated Graph Neural Network for Accurate QoS Prediction | Zehuan Chen, Xiangwei Lai | 2025-07-07 | Predicting Quality of Service (QoS) data is crucial for cloud service selection, where user privacy is a critical concern. Federated Graph Neural Networks (FGNNs) can perform QoS data prediction as well ... |
| Demystifying NCCL: An In-depth Analysis of GPU Communication Protocols and Algorithms | Zhiyi Hu, Siyuan Shen, Tommaso Bonato, Sylvain Jeaugey, Cedell Alexander, Eric Spada, James Dinan, Jeff Hammond, Torsten Hoefler | 2025-07-07 | The NVIDIA Collective Communication Library (NCCL) is a critical software layer enabling high-performance collectives on large-scale GPU clusters. |
| Communication Round and Computation Efficient Exclusive Prefix-Sums Algorithms (for MPI_Exscan) | Jesper Larsson Träff | 2025-07-07 | Parallel scan primitives compute element-wise inclusive or exclusive prefix sums of input vectors contributed by p consecutively ranked processors under an associative, binary operator ⊕. |
| Performance Evaluation of General Purpose Large Language Models for Basic Linear Algebra Subprograms Code Generation | Daichi Mukunoki, Shun-ichiro Hayashi, Tetsuya Hoshino, Takahiro Katagiri | 2025-07-07 | Generative AI technology based on Large Language Models (LLM) has been developed and applied to assist or automatically generate program codes. |
| RAPTOR: Practical Numerical Profiling of Scientific Applications | Faveo Hoerold, Ivan R. Ivanov, Akash Dhruv, William S. Moses, Anshu Dubey, Mohamed Wahib, Jens Domke | 2025-07-07 | The proliferation of low-precision units in modern high-performance architectures increasingly burdens domain scientists. Historically, the choice in HPC was easy: can we get away with 32 bit floating... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Date | Abstract |
| --- | --- | --- | --- |
| Age-Aware CSI Acquisition of a Finite-State Markovian Channel | Onur Ayan, Jiping Luo, Xueli An, Nikolaos Pappas | 2025-07-07 | The Age of Information (AoI) has emerged as a critical metric for quantifying information freshness; however, its interplay with channel estimation in partially observable wireless systems remains und... |
| User Association in the Presence of Jamming in Wireless Networks Using the Whittle Index | Pramod N Chine, Suven Jagtiani, Mandar R Nalavade, Gaurav S Kasbekar | 2025-07-07 | In wireless networks, algorithms for user association, i.e., the task of choosing the base station (BS) that every arriving user should join, significantly impact the network performance. |
| Large Language Models for Network Intrusion Detection Systems: Foundations, Implementations, and Future Directions | Shuo Yang, Xinran Zheng, Xinchen Zhang, Jinfeng Xu, Jinze Li, Donglin Xie, Weicai Long, Edith C. H. Ngai | 2025-07-07 | Large Language Models (LLMs) have revolutionized various fields with their exceptional capabilities in understanding, processing, and generating human-like text. |
| Low-Latency Software Polar Encoders and Decoders for Short Blocklengths | Mathieu Leonardon, Mohammed El Houcine Ayoubi, Adrien Cassagne, Romain Tajan, Camille Leroux | 2025-07-07 | This paper presents our low-latency Polar code encoders and decoders developed for the 2025 International Symposium on Topics in Coding (ISTC 2025) contest, which challenges participants to implement ... |
| Synergistic Localization and Sensing in MIMO-OFDM Systems via Mixed-Integer Bilevel Learning | Zelin Zhu, Kai Yang, Rui Zhang | 2025-07-07 | Wireless localization and sensing technologies are essential in modern wireless networks, supporting applications in smart cities, the Internet of Things (IoT), and autonomous systems. |
| Multimodal LLM Integrated Semantic Communications for 6G Immersive Experiences | Yusong Zhang, Yuxuan Sun, Lei Guo, Wei Chen, Bo Ai, Deniz Gunduz | 2025-07-07 | 6G networks promise revolutionary immersive communication experiences including augmented reality (AR), virtual reality (VR), and holographic communications. |
| On-Demand Multimedia Delivery in 6G: An Optimal-Cost Steiner Tree Approach | Zien Wang, Xiucheng Wang, Nan Cheng, Wenchao Xu, Wei Quan, Ruijin Sun, Conghao Zhou | 2025-07-07 | The exponential growth of multimedia data traffic in 6G networks poses unprecedented challenges for immersive communication, where ultra-high-definition, multi-quality streaming must be delivered on d... |

cs.PF - Performance

| Title | Authors | Date | Abstract |
| --- | --- | --- | --- |
| Accuracy and Consumption analysis from a compressed model by CompactifAI from Multiverse Computing | Damien Fovet, Shashank Chamoli, Sarah Oury, Srishti Singhal | 2025-07-07 | This study evaluates the performance of a compression method, called CompactifAI, developed by Multiverse Computing, applied to the large language model Llama 3.1 8B. |
| Hardware-efficient tractable probabilistic inference for TinyML Neurosymbolic AI applications | Jelin Leslin, Martin Trapp, Martin Andraud | 2025-07-07 | Neurosymbolic AI (NSAI) has recently emerged to mitigate limitations associated with deep learning (DL) models, e.g. quantifying their uncertainty or reasoning with explicit rules. |
