2025-09-17

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
ZKProphet: Understanding Performance of Zero-Knowledge Proofs on GPUs	Tarunesh Verma, Yichao Yuan, Nishil Talati, Todd Austin	2025-09-17	Download	Zero-Knowledge Proofs (ZKP) are protocols which construct cryptographic proofs to demonstrate knowledge of a secret input in a computation without revealing any information about the secret.
eIQ Neutron: Redefining Edge-AI Inference with Integrated NPU and Compiler Innovations	Lennart Bamberg, Filippo Minnella, Roberto Bosio, Fabrizio Ottati, Yuebin Wang, Jongmin Lee, Luciano Lavagno, Adam Fuks	2025-09-17	Download	Neural Processing Units (NPUs) are key to enabling efficient AI inference in resource-constrained edge environments. While peak tera operations per second (TOPS) is often used to gauge performance, it...
A TRRIP Down Memory Lane: Temperature-Based Re-Reference Interval Prediction For Instruction Caching	Henry Kao, Nikhil Sreekumar, Prabhdeep Singh Soni, Ali Sedaghati, Fang Su, Bryan Chan, Maziar Goudarzi, Reza Azimi	2025-09-17	Download	Modern mobile CPU software pose challenges for conventional instruction cache replacement policies due to their complex runtime behavior causing high reuse distance between executions of the same inst...
An RDMA-First Object Storage System with SmartNIC Offload	Yu Zhu, Aditya Dhakal, Pedro Bruel, Gourav Rattihalli, Yunming Xiao, Johann Lombardi, Dejan Milojicic	2025-09-17	Download	AI training and inference impose sustained, fine-grain I/O that stresses host-mediated, TCP-based storage paths. Motivated by kernel-bypass networking and user-space storage stacks, we revisit POSIX-c...
VerilogMonkey: Exploring Parallel Scaling for Automated Verilog Code Generation with LLMs	Juxin Niu, Yuxin Du, Dan Niu, Xi Wang, Zhe Jiang, Nan Guan	2025-09-17	Download	We present VerilogMonkey, an empirical study of parallel scaling for the under-explored task of automated Verilog generation. Parallel scaling improves LLM performance by sampling many outputs in para...
TENET: An Efficient Sparsity-Aware LUT-Centric Architecture for Ternary LLM Inference On Edge	Zhirui Huang, Rui Ma, Shijie Cao, Ran Shu, Ian Wang, Ting Cao, Chixiao Chen, Yongqiang Xiong	2025-09-17	Download	Ternary quantization has emerged as a powerful technique for reducing both computational and memory footprint of large language models (LLM), enabling efficient real-time inference deployment without ...
CompAir: Synergizing Complementary PIMs and In-Transit NoC Computation for Efficient LLM Acceleration	Hongyi Li, Songchen Ma, Huanyu Qu, Weihao Zhang, Jia Chen, Junfeng Lin, Fengbin Tu, Rong Zhao	2025-09-17	Download	The rapid advancement of Large Language Models (LLMs) has revolutionized various aspects of human life, yet their immense computational and energy demands pose significant challenges for efficient inf...
StreamTensor: Make Tensors Stream in Dataflow Accelerators for LLMs	Hanchen Ye, Deming Chen	2025-09-17	Download	Efficient execution of deep learning workloads on dataflow architectures is crucial for overcoming memory bottlenecks and maximizing performance.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Scaling Hybrid Quantum-HPC Applications with the Quantum Framework	Srikar Chundury, Amir Shehata, Seongmin Kim, Muralikrishnan Gopalakrishnan Meena, Chao Lu, Kalyana Gottiparthi, Eduardo Antonio Coello Perez, Frank Mueller, In-Saeng Suh	2025-09-17	Download	Hybrid quantum-high performance computing (Q-HPC) workflows are emerging as a key strategy for running quantum applications at scale in current noisy intermediate-scale quantum (NISQ) devices.
ZKProphet: Understanding Performance of Zero-Knowledge Proofs on GPUs	Tarunesh Verma, Yichao Yuan, Nishil Talati, Todd Austin	2025-09-17	Download	Zero-Knowledge Proofs (ZKP) are protocols which construct cryptographic proofs to demonstrate knowledge of a secret input in a computation without revealing any information about the secret.
Julia GraphBLAS with Nonblocking Execution	Pascal Costanza, Timothy G. Mattson, Raye Kimmerer, Benjamin Brock	2025-09-17	Download	From the beginning, the GraphBLAS were designed for "nonblocking execution"; i.e., calls to GraphBLAS methods return as soon as the arguments to the methods are validated and define a directed acycl...
A Closeness Centrality-based Circuit Partitioner for Quantum Simulations	Doru Thom Popovici, Harlin Lee, Mauro Del Ben, Naoki Yoshioka, Nobuyasu Ito, Katherine Klymko, Daan Camps, Anastasiia Butko	2025-09-17	Download	Simulating quantum circuits (QC) on high-performance computing (HPC) systems has become an essential method to benchmark algorithms and probe the potential of large-scale quantum computation despite t...
LLM Agents for Interactive Workflow Provenance: Reference Architecture and Evaluation Methodology	Renan Souza, Timothy Poteet, Brian Etz, Daniel Rosendo, Amal Gueroudji, Woong Shin, Prasanna Balaprakash, Rafael Ferreira da Silva	2025-09-17	Download	Modern scientific discovery increasingly relies on workflows that process data across the Edge, Cloud, and High Performance Computing (HPC) continuum.
Adaptive Client Selection via Q-Learning-based Whittle Index in Wireless Federated Learning	Qiyue Li, Yingxin Liu, Hang Qi, Jieping Luo, Zhizhang Liu, Jingjin Wu	2025-09-17	Download	We consider the client selection problem in wireless Federated Learning (FL), with the objective of reducing the total required time to achieve a certain level of learning accuracy.
Graph-Regularized Learning of Gaussian Mixture Models	Shamsiiat Abdurakhmanova, Alex Jung	2025-09-17	Download	We present a graph-regularized learning of Gaussian Mixture Models (GMMs) in distributed settings with heterogeneous and limited local data. The method exploits a provided similarity graph to guide pa...
ParaAegis: Parallel Protection for Flexible Privacy-preserved Federated Learning	Zihou Wu, Yuecheng Li, Tianchi Liao, Jian Lou, Chuan Chen	2025-09-17	Download	Federated learning (FL) faces a critical dilemma: existing protection mechanisms like differential privacy (DP) and homomorphic encryption (HE) enforce a rigid trade-off, forcing a choice between mode...
FLAME: A Serving System Optimized for Large-Scale Generative Recommendation with Efficiency	Xianwen Guo, Bin Huang, Xiaomeng Wu, Guanlin Wu, Fangjian Li, Shijia Wang, Qiang Xiao, Chuanjiang Luo, Yong Li	2025-09-17	Download	Generative recommendation (GR) models possess greater scaling power compared to traditional deep learning recommendation models (DLRMs), yet they also impose a tremendous increase in computational bur...
GPU Programming for AI Workflow Development on AWS SageMaker: An Instructional Approach	Sriram Srinivasan, Hamdan Alabsi, Rand Obeidat, Nithisha Ponnala, Azene Zenebe	2025-09-17	Download	We present the design, implementation, and comprehensive evaluation of a specialized course on GPU architecture, GPU programming, and how these are used for developing AI agents.
Federated Learning for Deforestation Detection: A Distributed Approach with Satellite Imagery	Yuvraj Dutta, Aaditya Sikder, Basabdatta Palit	2025-09-17	Download	Accurate identification of deforestation from satellite images is essential in order to understand the geographical situation of an area. This paper introduces a new distributed approach to identify a...
Secure, Scalable and Privacy Aware Data Strategy in Cloud	Vijay Kumar Butte, Sujata Butte	2025-09-17	Download	The enterprises today are faced with the tough challenge of processing, storing large amounts of data in a secure, scalable manner and enabling decision makers to make quick, informed data driven deci...
Taming Serverless Cold Starts Through OS Co-Design	Ben Holmes, Baltasar Dinis, Lana Honcharuk, Joshua Fried, Adam Belay	2025-09-17	Download	Serverless computing promises fine-grained elasticity and operational simplicity, fueling widespread interest from both industry and academia.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
TinyAC: Bringing Autonomic Computing Principles to Resource-Constrained Systems	Wojciech Kalka, Ruitao Xue, Kamil Faber, Aleksander Slominski, Devki Jha, Rajiv Ranjan, Tomasz Szydlo	2025-09-17	Download	Autonomic Computing (AC) is a promising approach for developing intelligent and adaptive self-management systems at the deep network edge. In this paper, we present the problems and challenges related...
Active Inference Framework for Closed-Loop Sensing, Communication, and Control in UAV Systems	Guangjin Pan, Liping Bai, Zhuojun Tian, Hui Chen, Mehdi Bennis, Henk Wymeersch	2025-09-17	Download	Integrated sensing and communication (ISAC) is a core technology for 6G, and its application to closed-loop sensing, communication, and control (SCC) enables various services.
RepCaM++: Exploring Transparent Visual Prompt With Inference-Time Re-Parameterization for Neural Video Delivery	Rongyu Zhang, Xize Duan, Jiaming Liu, Li Du, Yuan Du, Dan Wang, Shanghang Zhang, Fangxin Wang	2025-09-17	Download	Recently, content-aware methods have been employed to reduce bandwidth and enhance the quality of Internet video delivery. These methods involve training distinct content-aware super-resolution (SR) m...
Path-Oblivious Entanglement Swapping for the Quantum Internet	Vincent Mutolo, Rhea Parekh, Dan Rubenstein	2025-09-17	Download	Proposed Bell pair swapping protocols, an essential component of the Quantum Internet, are planned-path: specific, structured, routing paths are reserved prior to the execution of the swapping process...
Low-cost Highly-interoperable Multiplatform Campus Network: Experience of YARSI University	Surya Agustian, Sandra Permana, Salman Teguh Pratista, Syarifu Adam, Iswandi	2025-09-17	Download	To some organizations, building campus network is sometimes considered to be very expensive; and this has made the project uneasy to perform. Moreover, if the organization without sufficient IT knowle...
Performance Evaluation of Intent-Based Networking Scenarios: A GitOps and Nephio Approach	Saptarshi Ghosh, Ioannis Mavromatis, Konstantinos Antonakoglou, Konstantinos Katsaros	2025-09-17	Download	GitOps has emerged as a foundational paradigm for managing cloud-native infrastructures by enabling declarative configuration, version-controlled state, and automated reconciliation between intents an...
Domino: Dominant Path-based Compensation for Hardware Impairments in Modern WiFi Sensing	Ruiqi Kong, He Chen	2025-09-17	Download	WiFi sensing faces a critical reliability challenge due to hardware-induced RF distortions, especially with modern, market-dominant WiFi cards supporting 802.11ac/ax protocols.
A Survey and Evaluation Framework for Secure DNS Resolution	Ali Sadeghi Jahromi, AbdelRahman Abdou, Paul C. van Oorschot	2025-09-17	Download	Since security was not among the original design goals of the Domain Name System (herein called Vanilla DNS), many secure DNS schemes have been proposed to enhance the security and privacy of the DNS ...
Conducting Mission-Critical Voice Experiments with Automated Speech Recognition and Crowdsourcing	Jan Janak, Kahlil Dozier, Lauren Berny, Liang Hu, Dan Rubenstein, Charles Jennings, Henning Schulzrinne	2025-09-17	Download	Mission-critical voice (MCV) communications systems have been a critical tool for the public safety community for over eight decades. Public safety users expect MCV systems to operate reliably and con...
LINC: An In-Network Coding Approach to Tame Packet Loss in Hybrid Wireless-Fiber Backbones	Benoit Pit-Claudel, Muriel Médard, Manya Ghobadi	2025-09-17	Download	The emergence of ultra-low latency applications, such as financial transactions, has driven the development of hybrid backbone networks that rely on fiber, satellite, and microwave links.
A Framework for Multi-source Prefetching Through Adaptive Weight	Yoseph Berhanu Alebachew, Mulugeta Libsie	2025-09-17	Download	The World Wide Web has come to be a great part of our daily life, yet user observed latency is still a problem that needs a proper means of handling.

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
A Task Equalization Allocation Algorithm Incorporating Blocking Estimation and Resource Similarity Analysis for Vehicle Control Real-Time Systems	Qianlong Duan, Bide Hao, Fan Zhou, Chen Fei, Shichun Yang	2025-09-17	Download	In multi-core real-time vehicle control systems, synchronization blocking and resource contention pose critical challenges due to increasing task parallelism and shared resource access.
A TRRIP Down Memory Lane: Temperature-Based Re-Reference Interval Prediction For Instruction Caching	Henry Kao, Nikhil Sreekumar, Prabhdeep Singh Soni, Ali Sedaghati, Fang Su, Bryan Chan, Maziar Goudarzi, Reza Azimi	2025-09-17	Download	Modern mobile CPU software pose challenges for conventional instruction cache replacement policies due to their complex runtime behavior causing high reuse distance between executions of the same inst...
Taming Serverless Cold Starts Through OS Co-Design	Ben Holmes, Baltasar Dinis, Lana Honcharuk, Joshua Fried, Adam Belay	2025-09-17	Download	Serverless computing promises fine-grained elasticity and operational simplicity, fueling widespread interest from both industry and academia.

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
ZKProphet: Understanding Performance of Zero-Knowledge Proofs on GPUs	Tarunesh Verma, Yichao Yuan, Nishil Talati, Todd Austin	2025-09-17	Download	Zero-Knowledge Proofs (ZKP) are protocols which construct cryptographic proofs to demonstrate knowledge of a secret input in a computation without revealing any information about the secret.
A TRRIP Down Memory Lane: Temperature-Based Re-Reference Interval Prediction For Instruction Caching	Henry Kao, Nikhil Sreekumar, Prabhdeep Singh Soni, Ali Sedaghati, Fang Su, Bryan Chan, Maziar Goudarzi, Reza Azimi	2025-09-17	Download	Modern mobile CPU software pose challenges for conventional instruction cache replacement policies due to their complex runtime behavior causing high reuse distance between executions of the same inst...