2025-04-24

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
ApproXAI: Energy-Efficient Hardware Acceleration of Explainable AI using Approximate Computing	Ayesha Siddique, Khurram Khalil, Khaza Anuarul Hoque	2025-04-24	Download	Explainable artificial intelligence (XAI) enhances AI system transparency by framing interpretability as an optimization problem. However, this approach often necessitates numerous iterations of compu...
Biting the CHERI bullet: Blockers, Enablers and Security Implications of CHERI in Defence	Shamal Faily	2025-04-24	Download	There is growing interest in securing the hardware foundations software stacks build upon. However, before making any investment decision, software and hardware supply chain stakeholders require evide...
L3: DIMM-PIM Integrated Architecture and Coordination for Scalable Long-Context LLM Inference	Qingyuan Liu, Liyan Chen, Yanning Yang, Haocheng Wang, Dong Du, Zhigang Mao, Naifeng Jing, Yubin Xia, Haibo Chen	2025-04-24	Download	Large Language Models (LLMs) increasingly require processing long text sequences, but GPU memory limitations force difficult trade-offs between memory capacity and bandwidth.
On-Device Qwen2.5: Efficient LLM Inference with Model Compression and Hardware Acceleration	Maoyang Xiang, Ramesh Fernando, Bo Wang	2025-04-24	Download	Transformer-based Large Language Models (LLMs) have significantly advanced AI capabilities but pose considerable challenges for deployment on edge devices due to high computational demands, memory ban...
Fine-Grained Fusion: The Missing Piece in Area-Efficient State Space Model Acceleration	Robin Geens, Arne Symons, Marian Verhelst	2025-04-24	Download	State Space Models (SSMs) offer a promising alternative to transformers for long-sequence processing. However, their efficiency remains hindered by memory-bound operations, particularly in the prefill...
FLAG: Formal and LLM-assisted SVA Generation for Formal Specifications of On-Chip Communication Protocols	Yu-An Shih, Annie Lin, Aarti Gupta, Sharad Malik	2025-04-24	Download	Formal specifications of on-chip communication protocols are crucial for system-on-chip (SoC) design and verification. However, manually constructing these formal specifications from informal document...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
EPSILON: Adaptive Fault Mitigation in Approximate Deep Neural Network using Statistical Signatures	Khurram Khalil, Khaza Anuarul Hoque	2025-04-24	Download	The increasing adoption of approximate computing in deep neural network accelerators (AxDNNs) promises significant energy efficiency gains. However, permanent faults in AxDNNs can severely degrade the...
Optimized Cloud Resource Allocation Using Genetic Algorithms for Energy Efficiency and QoS Assurance	Caroline Panggabean, Devaraj Verma C, Bhagyashree Gogoi, Ranju Limbu, Rhythm Sarker	2025-04-24	Download	Cloud computing environments demand dynamic and efficient resource management to ensure optimal performance, reduced energy consumption, and adherence to Service Level Agreements (SLAs).
Cross-region Model Training with Communication-Computation Overlapping and Delay Compensation	Ying Zhu, Yang Xu, Hongli Xu, Yunming Liao, Zhiwei Yao, Liusheng Huang	2025-04-24	Download	Training large language models (LLMs) requires massive computational resources, often necessitating the aggregation of geographically distributed data centers (i.e., cross-region training).
TSUE: A Two-Stage Data Update Method for an Erasure Coded Cluster File System	Zheng Wei, Jing Xing, Yida Gu, Wenjing Huang, Dong Dai, Guangming Tan, Dingwen Tao	2025-04-24	Download	Compared to replication-based storage systems, erasure-coded storage incurs significantly higher overhead during data updates. To address this issue, various parity logging methods have been propose...
Shared Randomness in Locally Checkable Problems: The Role of Computational Assumptions	Adar Hadad, Moni Naor	2025-04-24	Download	Shared randomness is a valuable resource in distributed computing, allowing some form of coordination between processors without explicit communication.
Communication-Efficient Personalized Distributed Learning with Data and Node Heterogeneity	Zhuojun Tian, Zhaoyang Zhang, Yiwei Li, Mehdi Bennis	2025-04-24	Download	To jointly tackle the challenges of data and node heterogeneity in decentralized learning, we propose a distributed strong lottery ticket hypothesis (DSLTH), based on which a communication-efficient p...
GRANITE : a Byzantine-Resilient Dynamic Gossip Learning Framework	Yacine Belal, Mohamed Maouche, Sonia Ben Mokhtar, Anthony Simonet-Boulogne	2025-04-24	Download	Gossip Learning (GL) is a decentralized learning paradigm where users iteratively exchange and aggregate models with a small set of neighboring peers.
CHASe: Client Heterogeneity-Aware Data Selection for Effective Federated Active Learning	Jun Zhang, Jue Wang, Huan Li, Zhongle Xie, Ke Chen, Lidan Shou	2025-04-24	Download	Active learning (AL) reduces human annotation costs for machine learning systems by strategically selecting the most informative unlabeled data for annotation, but performing it individually may still...
Dynamic Approximate Maximum Matching in the Distributed Vertex Partition Model	Peter Robinson, Xianbin Zhu	2025-04-24	Download	We initiate the study of approximate maximum matching in the vertex partition model, for graphs subject to dynamic changes. We assume that the $n$ vertices of the graph are partitioned among $k$ playe...
JITServe: SLO-aware LLM Serving with Imprecise Request Information	Wei Zhang, Zhiyu Wu, Yi Mu, Rui Ning, Banruo Liu, Nikhil Sarda, Myungjin Lee, Fan Lai	2025-04-24	Download	The integration of Large Language Models (LLMs) into applications ranging from interactive chatbots to multi-agent systems has introduced a wide spectrum of service-level objectives (SLOs) for respons...
Developing a Blockchain-Based Secure Digital Contents Distribution System	Syed Mohiuddin Qadri, Sangwhan Cha	2025-04-24	Download	As digital content distribution expands rapidly through online platforms, securing digital media and protecting intellectual property has become increasingly complex.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Toward Low-Latency Services over PON using OCDMA Private Networks	Steevy J. Cordette	2025-04-24	Download	A low-latency service scheme is proposed over Passive Optical Network (PON). The Optical Code Division Multiplexing Access (OCDMA) technique is used to define multiple private networks serving as Vir...
STGen: A Novel Lightweight IoT Testbed for Generating Sensor Traffic for the Experimentation of IoT Protocol and its Application in Hybrid Network	Hasan MA Islam, S. Nath, M. Rahman, N. Shahriar, M. K. M. Khan, R. Islam	2025-04-24	Download	A Wireless Sensor Network (WSN) is a network that does not rely on a fixed infrastructure and consists of numerous sensors, such as temperature, humidity, GPS, and cameras, equipped with onboard proce...
Mitigating xApp conflicts for efficient network slicing in 6G O-RAN: a graph convolutional-based attention network approach	Sihem Bakri, Indrakshi Dey, Harun Siljak, Marco Ruffini, Nicola Marchetti	2025-04-24	Download	O-RAN (Open-Radio Access Network) offers a flexible, open architecture for next-generation wireless networks. Network slicing within O-RAN allows network operators to create customized virtual network...
An All-Optical Metro Network Architecture and QoS-Aware Wavelength Allocation Study for Converged Fixed, Mobile, and Edge Computing Multi-Granular Traffic	David Georgantas, Zhaoyang Liu, Georgios Drainakis, Bitao Pan, Adonis Bogris, Peristera Baziana	2025-04-24	Download	In this paper, we introduce an all-optical metro network architecture, called MOON, to serve converged multigranular traffic from fixed, mobile, and edge computing services.
An Extensible Software Transport Layer for GPU Networking	Yang Zhou, Zhongjie Chen, Ziming Mao, ChonLam Lao, Shuo Yang, Pravein Govindan Kannan, Jiaqi Gao, Yilong Zhao, Yongji Wu, Kaichao You, Fengyuan Ren, Zhiying Xu, Costin Raiciu, Ion Stoica	2025-04-24	Download	Fast-evolving machine learning (ML) workloads have increasing requirements for networking. However, host network transport on RDMA NICs is hard to evolve, causing problems for ML workloads.

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
Proto: A Guided Journey through Modern OS Construction	Wonkyo Choe, Rongxiang Wang, Afsara Benazir, Felix Xiaozhu Lin	2025-04-24	Download	Proto is a new instructional OS that runs on commodity, portable hardware. It showcases modern features, including per-app address spaces, threading, commodity filesystems, USB, DMA, multicore support...
Biting the CHERI bullet: Blockers, Enablers and Security Implications of CHERI in Defence	Shamal Faily	2025-04-24	Download	There is growing interest in securing the hardware foundations software stacks build upon. However, before making any investment decision, software and hardware supply chain stakeholders require evide...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
PHast -- Perfect Hashing made fast	Piotr Beling, Peter Sanders	2025-04-24	Download	Perfect hash functions give unique "names" to arbitrary keys requiring only a few bits per key. This is an essential building block in applications like static hash tables, databases, or bioinformatic...
PowerSensor3: A Fast and Accurate Open Source Power Measurement Tool	Steven van der Vlugt, Leon Oostrum, Gijs Schoonderbeek, Ben van Werkhoven, Bram Veenboer, Krijn Doekemeijer, John W. Romein	2025-04-24	Download	Power consumption is a major concern in data centers and HPC applications, with GPUs typically accounting for more than half of system power usage.