2025-08-22
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| zkPHIRE: A Programmable Accelerator for ZKPs over HIgh-degRee, Expressive Gates | Alhad Daftardar, Jianqiao Mo, Joey Ah-kiow, Benedikt Bünz, Siddharth Garg, Brandon Reagen | 2025-08-22 | Download | Zero-Knowledge Proofs (ZKPs) have emerged as a powerful tool for secure and privacy-preserving computation. ZKPs enable one party to convince another of a statement's validity without revealing anythi... |
| Systematic Characterization of LLM Quantization: A Performance, Energy, and Quality Perspective | Tianyao Shi, Yi Ding | 2025-08-22 | Download | Large language models (LLMs) have demonstrated remarkable capabilities across diverse domains, but their heavy resource demands make quantization (reducing precision to lower-bit formats) critical for e... |
| RIROS: A Parallel RTL Fault SImulation FRamework with TwO-Dimensional Parallelism and Unified Schedule | Jiaping Tang, Jianan Mu, Zizhen Liu, Ge Yu, Tenghui Hua, Bin Sun, Silin Liu, Jing Ye, Huawei Li | 2025-08-22 | Download | With the rapid development of safety-critical applications such as autonomous driving and embodied intelligence, the functional safety of the corresponding electronic chips becomes more critical. |
| Hardwired-Neurons Language Processing Units as General-Purpose Cognitive Substrates | Yang Liu, Yi Chen, Yongwei Zhao, Yifan Hao, Zifu Zheng, Weihao Kong, Zhangmai Li, Dongchen Jiang, Ruiyang Xia, Zhihong Ma, Zisheng Liu, Zhaoyong Wan, Yunqi Lu, Ximing Liu, Hongrui Guo, Zhihao Yang, Zhe Wang, Tianrui Ma, Mo Zou, Rui Zhang, Ling Li, Xing Hu, Zidong Du, Zhiwei Xu, Qi Guo, Tianshi Chen, Yunji Chen | 2025-08-22 | Download | The rapid advancement of Large Language Models (LLMs) has established language as a core general-purpose cognitive substrate, driving the demand for specialized Language Processing Units (LPUs) tailor... |
| Bare-Metal RISC-V + NVDLA SoC for Efficient Deep Learning Inference | Vineet Kumar, Ajay Kumar M, Yike Li, Shreejith Shanker, Deepu John | 2025-08-22 | Download | This paper presents a novel System-on-Chip (SoC) architecture for accelerating complex deep learning models for edge computing applications through a combination of hardware and software optimisations... |
| GPT-OSS-20B: A Comprehensive Deployment-Centric Analysis of OpenAI's Open-Weight Mixture of Experts Model | Deepak Kumar, Divakar Yadav, Yash Patel | 2025-08-22 | Download | We present a single-GPU (H100, bf16) evaluation of GPT-OSS-20B (Mixture-of-Experts; 20.9B total, approx. 3.61B active) against dense baselines Qwen3-32B and Yi-34B across multiple dimensions. |
| HePGA: A Heterogeneous Processing-in-Memory based GNN Training Accelerator | Chukwufumnanya Ogbogu, Gaurav Narang, Biresh Kumar Joardar, Janardhan Rao Doppa, Krishnendu Chakrabarty, Partha Pratim Pande | 2025-08-22 | Download | Processing-In-Memory (PIM) architectures offer a promising approach to accelerate Graph Neural Network (GNN) training and inference. However, various PIM devices such as ReRAM, FeFET, PCM, MRAM, and S... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| A User-centric Kubernetes-based Architecture for Green Cloud Computing | Matteo Zanotto, Leonardo Vicentini, Redi Vreto, Francesco Lumpp, Diego Braga, Sandro Fiore | 2025-08-22 | Download | To meet the increasing demand for cloud computing services, the scale and number of data centers keep increasing worldwide. This growth comes at the cost of increased electricity consumption, which d... |
| PICO: Performance Insights for Collective Operations | Saverio Pasqualoni, Lorenzo Piarulli, Daniele De Sensi | 2025-08-22 | Download | Collective operations are cornerstones of both HPC applications and large-scale AI training and inference. Yet, comprehensive, systematic and reproducible performance evaluation and benchmarking of sai... |
| Neuromorphic Simulation of Drosophila Melanogaster Brain Connectome on Loihi 2 | Felix Wang, Bradley H. Theilman, Fred Rothganger, William Severa, Craig M. Vineyard, James B. Aimone | 2025-08-22 | Download | We demonstrate the first-ever nontrivial, biologically realistic connectome simulated on neuromorphic computing hardware. Specifically, we implement the whole-brain connectome of the adult Drosophila ... |
| On the Duality of Task and Actor Programming Models | Rohan Yadav, Joseph Guman, Sean Treichler, Michael Garland, Alex Aiken, Fredrik Kjolstad, Michael Bauer | 2025-08-22 | Download | Programming models for distributed and heterogeneous machines are rapidly growing in popularity to meet the demands of modern workloads. Task and actor models are common choices that offer different t... |
| Systematic Characterization of LLM Quantization: A Performance, Energy, and Quality Perspective | Tianyao Shi, Yi Ding | 2025-08-22 | Download | Large language models (LLMs) have demonstrated remarkable capabilities across diverse domains, but their heavy resource demands make quantization (reducing precision to lower-bit formats) critical for e... |
| Generalizing Brooks' theorem via Partial Coloring is Hard Classically and Locally | Jan Bok, Avinandan Das, Anna Gujgiczer, Nikola Jedličková | 2025-08-22 | Download | We investigate the classical and distributed complexity of partial coloring, a natural generalization of Brooks' theorem where each vertex should be colored from the palette... |
| Scalable hybrid quantum Monte Carlo simulation of U(1) gauge field coupled to fermions on GPU | Kexin Feng, Chuang Chen, Zi Yang Meng | 2025-08-22 | Download | We develop a GPU-accelerated hybrid quantum Monte Carlo (QMC) algorithm to solve the fundamental yet difficult problem of U(1) gauge field coupled to fermions, which gives rise to a Dirac spi... |
| Hybrid Classical-Quantum Supercomputing: A demonstration of a multi-user, multi-QPU and multi-GPU environment | Mateusz Slysz, Piotr Rydlichowski, Krzysztof Kurowski, Omar Bacarreza, Esperanza Cuenca Gomez, Zohim Chandani, Bettina Heim, Pradnya Khalate, William R. Clements, James Fletcher | 2025-08-22 | Download | Achieving a practical quantum advantage for near-term applications is widely expected to rely on hybrid classical-quantum algorithms. To deliver this practical advantage to users, high performance com... |
| Self-Healing Network of Interconnected Edge Devices Empowered by Infrastructure-as-Code and LoRa Communication | Rob Carson, Mohamed Chahine Ghanem, Feriel Bouakkaz | 2025-08-22 | Download | This paper proposes a self-healing, automated network of Raspberry Pi devices designed for deployment in scenarios where traditional networking is unavailable. |
| GPT-OSS-20B: A Comprehensive Deployment-Centric Analysis of OpenAI's Open-Weight Mixture of Experts Model | Deepak Kumar, Divakar Yadav, Yash Patel | 2025-08-22 | Download | We present a single-GPU (H100, bf16) evaluation of GPT-OSS-20B (Mixture-of-Experts; 20.9B total, approx. 3.61B active) against dense baselines Qwen3-32B and Yi-34B across multiple dimensions. |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| QoS-based Intelligent multi-connectivity for B5G networks | Ali Parsa, Neda Moghim, Sachin Shetty | 2025-08-22 | Download | The rapid advancement of communication technologies has established cellular networks as the backbone for diverse applications, each with distinct quality of service requirements. |
| Self-Healing Network of Interconnected Edge Devices Empowered by Infrastructure-as-Code and LoRa Communication | Rob Carson, Mohamed Chahine Ghanem, Feriel Bouakkaz | 2025-08-22 | Download | This paper proposes a self-healing, automated network of Raspberry Pi devices designed for deployment in scenarios where traditional networking is unavailable. |
| Set Transformer Architectures and Synthetic Data Generation for Flow-Guided Nanoscale Localization | Mika Leo Hube, Filip Lemic, Ethungshan Shitiri, Gerard Calvo Bartra, Sergi Abadal, Xavier Costa Pérez | 2025-08-22 | Download | Flow-guided Localization (FGL) enables the identification of spatial regions within the human body that contain an event of diagnostic interest. |
| Joint Cache Placement and Routing in Satellite-Terrestrial Edge Computing Network: A GNN-Enabled DRL Approach | Yuhao Zheng, Ting You, Kejia Peng, Chang Liu | 2025-08-22 | Download | In this letter, we investigate the problem of joint content caching and routing in satellite-terrestrial edge computing networks (STECNs) to improve caching service for geographically distributed user... |
| ANSC: Probabilistic Capacity Health Scoring for Datacenter-Scale Reliability | Madhava Gaikwad, Abhishek Gandhi | 2025-08-22 | Download | We present ANSC, a probabilistic capacity health scoring framework for hyperscale datacenter fabrics. While existing alerting systems detect individual device or link failures, they do not capture the... |
| A Survey of Post-Quantum Cryptography Support in Cryptographic Libraries | Nadeem Ahmed, Lei Zhang, Aryya Gangopadhyay | 2025-08-22 | Download | The rapid advancement of quantum computing poses a significant threat to modern cryptographic systems, necessitating the transition to Post-Quantum Cryptography (PQC). |
| Congestion Control System Optimization with Large Language Models | Zhiyuan He, Aashish Gottipati, Lili Qiu, Yuqing Yang, Francis Y. Yan | 2025-08-22 | Download | Congestion control is a fundamental component of Internet infrastructure, and researchers have dedicated considerable effort to developing improved congestion control algorithms. |
| Time Series Based Network Intrusion Detection using MTF-Aided Transformer | Poorvi Joshi, Mohan Gurusamy | 2025-08-22 | Download | This paper introduces a novel approach to time series classification using a Markov Transition Field (MTF)-aided Transformer model, specifically designed for Software-Defined Networks (SDNs). |
| CoVeRaP: Cooperative Vehicular Perception through mmWave FMCW Radars | Jinyue Song, Hansol Ku, Jayneel Vora, Nelson Lee, Ahmad Kamari, Prasant Mohapatra, Parth Pathak | 2025-08-22 | Download | Automotive FMCW radars remain reliable in rain and glare, yet their sparse, noisy point clouds constrain 3-D object detection. We therefore release CoVeRaP, a 21 k-frame cooperative dataset that time-... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| PICO: Performance Insights for Collective Operations | Saverio Pasqualoni, Lorenzo Piarulli, Daniele De Sensi | 2025-08-22 | Download | Collective operations are cornerstones of both HPC applications and large-scale AI training and inference. Yet, comprehensive, systematic and reproducible performance evaluation and benchmarking of sai... |
| GreenLLM: SLO-Aware Dynamic Frequency Scaling for Energy-Efficient LLM Serving | Qunyou Liu, Darong Huang, Marina Zapater, David Atienza | 2025-08-22 | Download | Large Language Models (LLMs) are becoming the backbone of modern cloud services, yet their inference costs are dominated by GPU energy. Unlike traditional GPU workloads, LLM inference has two stages w... |
| Systematic Characterization of LLM Quantization: A Performance, Energy, and Quality Perspective | Tianyao Shi, Yi Ding | 2025-08-22 | Download | Large language models (LLMs) have demonstrated remarkable capabilities across diverse domains, but their heavy resource demands make quantization (reducing precision to lower-bit formats) critical for e... |
| Two-Timescale Dynamic Service Deployment and Task Scheduling with Spatiotemporal Collaboration in Mobile Edge Networks | Yang Li, Xing Zhang, Yunji Zhao, Wenbo Wang | 2025-08-22 | Download | Collaborative edge computing addresses the resource constraints of individual edge nodes by enabling resource sharing and task co-processing across multiple nodes. |
| ShadowNPU: System and Algorithm Co-design for NPU-Centric On-Device LLM Inference | Wangsong Yin, Daliang Xu, Mengwei Xu, Gang Huang, Xuanzhe Liu | 2025-08-22 | Download | Running Large Language Models (LLMs) on-device is nowadays a critical enabler for preserving user privacy. We observe that the attention operator falls back from the special-purpose NPU to the gen... |
| GPT-OSS-20B: A Comprehensive Deployment-Centric Analysis of OpenAI's Open-Weight Mixture of Experts Model | Deepak Kumar, Divakar Yadav, Yash Patel | 2025-08-22 | Download | We present a single-GPU (H100, bf16) evaluation of GPT-OSS-20B (Mixture-of-Experts; 20.9B total, approx. 3.61B active) against dense baselines Qwen3-32B and Yi-34B across multiple dimensions. |