2025-09-17

cs.AR - Architecture

Title | Authors | Published | PDF | Abstract
ZKProphet: Understanding Performance of Zero-Knowledge Proofs on GPUs | Tarunesh Verma, Yichao Yuan, Nishil Talati, Todd Austin | 2025-09-17 | Download | Zero-Knowledge Proofs (ZKP) are protocols which construct cryptographic proofs to demonstrate knowledge of a secret input in a computation without revealing any information about the secret.
eIQ Neutron: Redefining Edge-AI Inference with Integrated NPU and Compiler Innovations | Lennart Bamberg, Filippo Minnella, Roberto Bosio, Fabrizio Ottati, Yuebin Wang, Jongmin Lee, Luciano Lavagno, Adam Fuks | 2025-09-17 | Download | Neural Processing Units (NPUs) are key to enabling efficient AI inference in resource-constrained edge environments. While peak tera operations per second (TOPS) is often used to gauge performance, it...
A TRRIP Down Memory Lane: Temperature-Based Re-Reference Interval Prediction For Instruction Caching | Henry Kao, Nikhil Sreekumar, Prabhdeep Singh Soni, Ali Sedaghati, Fang Su, Bryan Chan, Maziar Goudarzi, Reza Azimi | 2025-09-17 | Download | Modern mobile CPU software pose challenges for conventional instruction cache replacement policies due to their complex runtime behavior causing high reuse distance between executions of the same inst...
An RDMA-First Object Storage System with SmartNIC Offload | Yu Zhu, Aditya Dhakal, Pedro Bruel, Gourav Rattihalli, Yunming Xiao, Johann Lombardi, Dejan Milojicic | 2025-09-17 | Download | AI training and inference impose sustained, fine-grain I/O that stresses host-mediated, TCP-based storage paths. Motivated by kernel-bypass networking and user-space storage stacks, we revisit POSIX-c...
VerilogMonkey: Exploring Parallel Scaling for Automated Verilog Code Generation with LLMs | Juxin Niu, Yuxin Du, Dan Niu, Xi Wang, Zhe Jiang, Nan Guan | 2025-09-17 | Download | We present VerilogMonkey, an empirical study of parallel scaling for the under-explored task of automated Verilog generation. Parallel scaling improves LLM performance by sampling many outputs in para...
TENET: An Efficient Sparsity-Aware LUT-Centric Architecture for Ternary LLM Inference On Edge | Zhirui Huang, Rui Ma, Shijie Cao, Ran Shu, Ian Wang, Ting Cao, Chixiao Chen, Yongqiang Xiong | 2025-09-17 | Download | Ternary quantization has emerged as a powerful technique for reducing both computational and memory footprint of large language models (LLM), enabling efficient real-time inference deployment without ...
CompAir: Synergizing Complementary PIMs and In-Transit NoC Computation for Efficient LLM Acceleration | Hongyi Li, Songchen Ma, Huanyu Qu, Weihao Zhang, Jia Chen, Junfeng Lin, Fengbin Tu, Rong Zhao | 2025-09-17 | Download | The rapid advancement of Large Language Models (LLMs) has revolutionized various aspects of human life, yet their immense computational and energy demands pose significant challenges for efficient inf...
StreamTensor: Make Tensors Stream in Dataflow Accelerators for LLMs | Hanchen Ye, Deming Chen | 2025-09-17 | Download | Efficient execution of deep learning workloads on dataflow architectures is crucial for overcoming memory bottlenecks and maximizing performance.

cs.DC - Distributed, Parallel, and Cluster Computing

Title | Authors | Published | PDF | Abstract
Scaling Hybrid Quantum-HPC Applications with the Quantum Framework | Srikar Chundury, Amir Shehata, Seongmin Kim, Muralikrishnan Gopalakrishnan Meena, Chao Lu, Kalyana Gottiparthi, Eduardo Antonio Coello Perez, Frank Mueller, In-Saeng Suh | 2025-09-17 | Download | Hybrid quantum-high performance computing (Q-HPC) workflows are emerging as a key strategy for running quantum applications at scale in current noisy intermediate-scale quantum (NISQ) devices.
ZKProphet: Understanding Performance of Zero-Knowledge Proofs on GPUs | Tarunesh Verma, Yichao Yuan, Nishil Talati, Todd Austin | 2025-09-17 | Download | Zero-Knowledge Proofs (ZKP) are protocols which construct cryptographic proofs to demonstrate knowledge of a secret input in a computation without revealing any information about the secret.
Julia GraphBLAS with Nonblocking Execution | Pascal Costanza, Timothy G. Mattson, Raye Kimmerer, Benjamin Brock | 2025-09-17 | Download | From the beginning, the GraphBLAS were designed for "nonblocking execution"; i.e., calls to GraphBLAS methods return as soon as the arguments to the methods are validated and define a directed acycl...
A Closeness Centrality-based Circuit Partitioner for Quantum Simulations | Doru Thom Popovici, Harlin Lee, Mauro Del Ben, Naoki Yoshioka, Nobuyasu Ito, Katherine Klymko, Daan Camps, Anastasiia Butko | 2025-09-17 | Download | Simulating quantum circuits (QC) on high-performance computing (HPC) systems has become an essential method to benchmark algorithms and probe the potential of large-scale quantum computation despite t...
LLM Agents for Interactive Workflow Provenance: Reference Architecture and Evaluation Methodology | Renan Souza, Timothy Poteet, Brian Etz, Daniel Rosendo, Amal Gueroudji, Woong Shin, Prasanna Balaprakash, Rafael Ferreira da Silva | 2025-09-17 | Download | Modern scientific discovery increasingly relies on workflows that process data across the Edge, Cloud, and High Performance Computing (HPC) continuum.
Adaptive Client Selection via Q-Learning-based Whittle Index in Wireless Federated Learning | Qiyue Li, Yingxin Liu, Hang Qi, Jieping Luo, Zhizhang Liu, Jingjin Wu | 2025-09-17 | Download | We consider the client selection problem in wireless Federated Learning (FL), with the objective of reducing the total required time to achieve a certain level of learning accuracy.
Graph-Regularized Learning of Gaussian Mixture Models | Shamsiiat Abdurakhmanova, Alex Jung | 2025-09-17 | Download | We present a graph-regularized learning of Gaussian Mixture Models (GMMs) in distributed settings with heterogeneous and limited local data. The method exploits a provided similarity graph to guide pa...
ParaAegis: Parallel Protection for Flexible Privacy-preserved Federated Learning | Zihou Wu, Yuecheng Li, Tianchi Liao, Jian Lou, Chuan Chen | 2025-09-17 | Download | Federated learning (FL) faces a critical dilemma: existing protection mechanisms like differential privacy (DP) and homomorphic encryption (HE) enforce a rigid trade-off, forcing a choice between mode...
FLAME: A Serving System Optimized for Large-Scale Generative Recommendation with Efficiency | Xianwen Guo, Bin Huang, Xiaomeng Wu, Guanlin Wu, Fangjian Li, Shijia Wang, Qiang Xiao, Chuanjiang Luo, Yong Li | 2025-09-17 | Download | Generative recommendation (GR) models possess greater scaling power compared to traditional deep learning recommendation models (DLRMs), yet they also impose a tremendous increase in computational bur...
GPU Programming for AI Workflow Development on AWS SageMaker: An Instructional Approach | Sriram Srinivasan, Hamdan Alabsi, Rand Obeidat, Nithisha Ponnala, Azene Zenebe | 2025-09-17 | Download | We present the design, implementation, and comprehensive evaluation of a specialized course on GPU architecture, GPU programming, and how these are used for developing AI agents.
Federated Learning for Deforestation Detection: A Distributed Approach with Satellite Imagery | Yuvraj Dutta, Aaditya Sikder, Basabdatta Palit | 2025-09-17 | Download | Accurate identification of deforestation from satellite images is essential in order to understand the geographical situation of an area. This paper introduces a new distributed approach to identify a...
Secure, Scalable and Privacy Aware Data Strategy in Cloud | Vijay Kumar Butte, Sujata Butte | 2025-09-17 | Download | The enterprises today are faced with the tough challenge of processing, storing large amounts of data in a secure, scalable manner and enabling decision makers to make quick, informed data driven deci...
Taming Serverless Cold Starts Through OS Co-Design | Ben Holmes, Baltasar Dinis, Lana Honcharuk, Joshua Fried, Adam Belay | 2025-09-17 | Download | Serverless computing promises fine-grained elasticity and operational simplicity, fueling widespread interest from both industry and academia.

cs.NI - Networking and Internet Architecture

Title | Authors | Published | PDF | Abstract
TinyAC: Bringing Autonomic Computing Principles to Resource-Constrained Systems | Wojciech Kalka, Ruitao Xue, Kamil Faber, Aleksander Slominski, Devki Jha, Rajiv Ranjan, Tomasz Szydlo | 2025-09-17 | Download | Autonomic Computing (AC) is a promising approach for developing intelligent and adaptive self-management systems at the deep network edge. In this paper, we present the problems and challenges related...
Active Inference Framework for Closed-Loop Sensing, Communication, and Control in UAV Systems | Guangjin Pan, Liping Bai, Zhuojun Tian, Hui Chen, Mehdi Bennis, Henk Wymeersch | 2025-09-17 | Download | Integrated sensing and communication (ISAC) is a core technology for 6G, and its application to closed-loop sensing, communication, and control (SCC) enables various services.
RepCaM++: Exploring Transparent Visual Prompt With Inference-Time Re-Parameterization for Neural Video Delivery | Rongyu Zhang, Xize Duan, Jiaming Liu, Li Du, Yuan Du, Dan Wang, Shanghang Zhang, Fangxin Wang | 2025-09-17 | Download | Recently, content-aware methods have been employed to reduce bandwidth and enhance the quality of Internet video delivery. These methods involve training distinct content-aware super-resolution (SR) m...
Path-Oblivious Entanglement Swapping for the Quantum Internet | Vincent Mutolo, Rhea Parekh, Dan Rubenstein | 2025-09-17 | Download | Proposed Bell pair swapping protocols, an essential component of the Quantum Internet, are planned-path: specific, structured, routing paths are reserved prior to the execution of the swapping process...
Low-cost Highly-interoperable Multiplatform Campus Network: Experience of YARSI University | Surya Agustian, Sandra Permana, Salman Teguh Pratista, Syarifu Adam, Iswandi | 2025-09-17 | Download | To some organizations, building campus network is sometimes considered to be very expensive; and this has made the project uneasy to perform. Moreover, if the organization without sufficient IT knowle...
Performance Evaluation of Intent-Based Networking Scenarios: A GitOps and Nephio Approach | Saptarshi Ghosh, Ioannis Mavromatis, Konstantinos Antonakoglou, Konstantinos Katsaros | 2025-09-17 | Download | GitOps has emerged as a foundational paradigm for managing cloud-native infrastructures by enabling declarative configuration, version-controlled state, and automated reconciliation between intents an...
Domino: Dominant Path-based Compensation for Hardware Impairments in Modern WiFi Sensing | Ruiqi Kong, He Chen | 2025-09-17 | Download | WiFi sensing faces a critical reliability challenge due to hardware-induced RF distortions, especially with modern, market-dominant WiFi cards supporting 802.11ac/ax protocols.
A Survey and Evaluation Framework for Secure DNS Resolution | Ali Sadeghi Jahromi, AbdelRahman Abdou, Paul C. van Oorschot | 2025-09-17 | Download | Since security was not among the original design goals of the Domain Name System (herein called Vanilla DNS), many secure DNS schemes have been proposed to enhance the security and privacy of the DNS ...
Conducting Mission-Critical Voice Experiments with Automated Speech Recognition and Crowdsourcing | Jan Janak, Kahlil Dozier, Lauren Berny, Liang Hu, Dan Rubenstein, Charles Jennings, Henning Schulzrinne | 2025-09-17 | Download | Mission-critical voice (MCV) communications systems have been a critical tool for the public safety community for over eight decades. Public safety users expect MCV systems to operate reliably and con...
LINC: An In-Network Coding Approach to Tame Packet Loss in Hybrid Wireless-Fiber Backbones | Benoit Pit-Claudel, Muriel Médard, Manya Ghobadi | 2025-09-17 | Download | The emergence of ultra-low latency applications, such as financial transactions, has driven the development of hybrid backbone networks that rely on fiber, satellite, and microwave links.
A Framework for Multi-source Prefetching Through Adaptive Weight | Yoseph Berhanu Alebachew, Mulugeta Libsie | 2025-09-17 | Download | The World Wide Web has come to be a great part of our daily life, yet user observed latency is still a problem that needs a proper means of handling.

cs.OS - Operating Systems

Title | Authors | Published | PDF | Abstract
A Task Equalization Allocation Algorithm Incorporating Blocking Estimation and Resource Similarity Analysis for Vehicle Control Real-Time Systems | Qianlong Duan, Bide Hao, Fan Zhou, Chen Fei, Shichun Yang | 2025-09-17 | Download | In multi-core real-time vehicle control systems, synchronization blocking and resource contention pose critical challenges due to increasing task parallelism and shared resource access.
A TRRIP Down Memory Lane: Temperature-Based Re-Reference Interval Prediction For Instruction Caching | Henry Kao, Nikhil Sreekumar, Prabhdeep Singh Soni, Ali Sedaghati, Fang Su, Bryan Chan, Maziar Goudarzi, Reza Azimi | 2025-09-17 | Download | Modern mobile CPU software pose challenges for conventional instruction cache replacement policies due to their complex runtime behavior causing high reuse distance between executions of the same inst...
Taming Serverless Cold Starts Through OS Co-Design | Ben Holmes, Baltasar Dinis, Lana Honcharuk, Joshua Fried, Adam Belay | 2025-09-17 | Download | Serverless computing promises fine-grained elasticity and operational simplicity, fueling widespread interest from both industry and academia.

cs.PF - Performance

Title | Authors | Published | PDF | Abstract
ZKProphet: Understanding Performance of Zero-Knowledge Proofs on GPUs | Tarunesh Verma, Yichao Yuan, Nishil Talati, Todd Austin | 2025-09-17 | Download | Zero-Knowledge Proofs (ZKP) are protocols which construct cryptographic proofs to demonstrate knowledge of a secret input in a computation without revealing any information about the secret.
A TRRIP Down Memory Lane: Temperature-Based Re-Reference Interval Prediction For Instruction Caching | Henry Kao, Nikhil Sreekumar, Prabhdeep Singh Soni, Ali Sedaghati, Fang Su, Bryan Chan, Maziar Goudarzi, Reza Azimi | 2025-09-17 | Download | Modern mobile CPU software pose challenges for conventional instruction cache replacement policies due to their complex runtime behavior causing high reuse distance between executions of the same inst...
