2025-07-22

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
MTU: The Multifunction Tree Unit for Accelerating Zero-Knowledge Proofs	Jianqiao Mo, Alhad Daftardar, Joey Ah-Kiow, Kaiyue Guo, Benedikt Bünz, Siddharth Garg, Brandon Reagen	2025-07-22	Download	Zero-Knowledge Proofs (ZKPs) are critical for privacy-preserving techniques and verifiable computation. Many ZKP protocols rely on key kernels such as the SumCheck protocol and Merkle Tree commitments...
Custom Algorithm-based Fault Tolerance for Attention Layers in Transformers	Vasileios Titopoulos, Kosmas Alexandridis, Giorgos Dimitrakopoulos	2025-07-22	Download	Transformers and large language models (LLMs), powered by the attention mechanism, have transformed numerous AI applications, driving the need for specialized hardware accelerators.
Augmenting Von Neumann's Architecture for an Intelligent Future	Rajpreet Singh, Vidhi Kothari	2025-07-22	Download	This work presents a novel computer architecture that extends the Von Neumann model with a dedicated Reasoning Unit (RU) to enable native artificial general intelligence capabilities.
Optimization of DNN-based HSI Segmentation FPGA-based SoC for ADS: A Practical Approach	Jon Gutiérrez-Zaballa, Koldo Basterretxea, Javier Echanobe	2025-07-22	Download	The use of HSI for autonomous navigation is a promising research field aimed at improving the accuracy and robustness of detection, tracking, and scene understanding systems based on vision sensors.
Ironman: Accelerating Oblivious Transfer Extension for Privacy-Preserving AI with Near-Memory Processing	Chenqi Lin, Kang Yang, Tianshi Xu, Ling Liang, Yufei Wang, Zhaohui Chen, Runsheng Wang, Mingyu Gao, Meng Li	2025-07-22	Download	With the wide application of machine learning (ML), privacy concerns arise with user data as they may contain sensitive information. Privacy-preserving ML (PPML) based on cryptographic primitives has ...
ApproxGNN: A Pretrained GNN for Parameter Prediction in Design Space Exploration for Approximate Computing	Ondrej Vlcek, Vojtech Mrazek	2025-07-22	Download	Approximate computing offers promising energy efficiency benefits for error-tolerant applications, but discovering optimal approximations requires extensive design space exploration (DSE).
Hourglass Sorting: A novel parallel sorting algorithm and its implementation	Daniel Bascones, Borja Morcillo	2025-07-22	Download	Sorting is one of the fundamental problems in computer science. Playing a role in many processes, it has a lower complexity bound imposed by $\mathcal{O}(n\log{n})$ when executing on a sequential mach...
SVAgent: AI Agent for Hardware Security Verification Assertion	Rui Guo, Avinash Ayalasomayajula, Henian Li, Jingbo Zhou, Sujan Kumar Saha, Farimah Farahmandi	2025-07-22	Download	Verification using SystemVerilog assertions (SVA) is one of the most popular methods for detecting circuit design vulnerabilities. However, with the globalization of integrated circuit design and the ...
RealBench: Benchmarking Verilog Generation Models with Real-World IP Designs	Pengwei Jin, Di Huang, Chongxiao Li, Shuyao Cheng, Yang Zhao, Xinyao Zheng, Jiaguo Zhu, Shuyi Xing, Bohan Dou, Rui Zhang, Zidong Du, Qi Guo, Xing Hu	2025-07-22	Download	The automatic generation of Verilog code using Large Language Models (LLMs) has garnered significant interest in hardware design automation. However, existing benchmarks for evaluating LLMs in Verilog...
A Sparsity-Aware Autonomous Path Planning Accelerator with HW/SW Co-Design and Multi-Level Dataflow Optimization	Yifan Zhang, Xiaoyu Niu, Hongzheng Tian, Yanjun Zhang, Bo Yu, Shaoshan Liu, Sitao Huang	2025-07-22	Download	Path planning is critical for autonomous driving, generating smooth, collision-free, feasible paths based on perception and localization inputs.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Cooling Matters: Benchmarking Large Language Models and Vision-Language Models on Liquid-Cooled Versus Air-Cooled H100 GPU Systems	Imran Latif, Muhammad Ali Shafique, Hayat Ullah, Alex C. Newkirk, Xi Yu, Arslan Munir	2025-07-22	Download	The unprecedented growth in artificial intelligence (AI) workloads, recently dominated by large language models (LLMs) and vision-language models (VLMs), has intensified power and cooling demands in d...
Collaborative Inference and Learning between Edge SLMs and Cloud LLMs: A Survey of Algorithms, Execution, and Open Challenges	Senyao Li, Haozhao Wang, Wenchao Xu, Rui Zhang, Song Guo, Jingling Yuan, Xian Zhong, Tianwei Zhang, Ruixuan Li	2025-07-22	Download	As large language models (LLMs) evolve, deploying them solely in the cloud or compressing them for edge devices has become inadequate due to concerns about latency, privacy, cost, and personalization.
AcceleratedKernels.jl: Cross-Architecture Parallel Algorithms from a Unified, Transpiled Codebase	Andrei-Leonard Nicusan, Dominik Werner, Simon Branford, Simon Hartley, Andrew J. Morris, Kit Windows-Yule	2025-07-22	Download	AcceleratedKernels.jl is introduced as a backend-agnostic library for parallel computing in Julia, natively targeting NVIDIA, AMD, Intel, and Apple accelerators via a unique transpilation architecture...
FOGNITE: Federated Learning-Enhanced Fog-Cloud Architecture	Somayeh Sobati-M	2025-07-22	Download	Modern smart grids demand fast, intelligent, and energy-aware computing at the edge to manage real time fluctuations and ensure reliable operation.
An Experimental Study of Split-Learning TinyML on Ultra-Low-Power Edge/IoT Nodes	Zied Jenhani, Mounir Bensalem, Jasenka Dizdarević, Admela Jukan	2025-07-22	Download	Running deep learning inference directly on ultra-low-power edge/IoT nodes has been limited by the tight memory and compute budgets of microcontrollers.
Autonomous Dominant Resource Fairness for Blockchain Ecosystems	Serdar Metin	2025-07-22	Download	Blockchain systems have been a part of mainstream academic research, and a hot topic at that. It has spread to almost every subfield in the computer science literature, as well as economics and financ...
STAlloc: Enhancing Memory Efficiency in Large-Scale Model Training with Spatio-Temporal Planning	Zixiao Huang, Junhao Hu, Hao Lin, Chunyang Zhu, Yueran Tang, Quanlu Zhang, Zhen Guo, Zhenhua Li, Shengen Yan, Zhenhua Zhu, Guohao Dai, Yu Wang	2025-07-22	Download	The rapid scaling of large language models (LLMs) has significantly increased GPU memory pressure, which is further aggravated by training optimization techniques such as virtual pipeline and recomput...
Improved Wake-Up Time For Euclidean Freeze-Tag Problem	Sharareh Alipour, Arash Ahadi, Kajal Baghestani	2025-07-22	Download	The Freeze-Tag Problem (FTP) involves activating a set of initially asleep robots as quickly as possible, starting from a single awake robot. Once activated, a robot can assist in waking up other robo...
Parallel Ray Tracing of Black Hole Images Using the Schwarzschild Metric	Liam Naddell, Marcelo Ponce	2025-07-22	Download	Rendering images of black holes by utilizing ray tracing techniques is a common methodology employed in many aspects of scientific and astrophysical visualizations.
DP2Guard: A Lightweight and Byzantine-Robust Privacy-Preserving Federated Learning Scheme for Industrial IoT	Baofu Han, Bing Li, Yining Qi, Raja Jurdak, Kaibin Huang, Chau Yuen	2025-07-22	Download	Privacy-Preserving Federated Learning (PPFL) has emerged as a secure distributed Machine Learning (ML) paradigm that aggregates locally trained gradients without exposing raw data.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Analysis of Post-Quantum Cryptography in User Equipment in 5G and Beyond	Sanzida Hoque, Abdullah Aydeger, Engin Zeydan, Madhusanka Liyanage	2025-07-22	Download	The advent of quantum computing threatens the security of classical public-key cryptographic systems, prompting the transition to post-quantum cryptography (PQC).
SoK: Securing the Final Frontier for Cybersecurity in Space-Based Infrastructure	Nafisa Anjum, Tasnuva Farheen	2025-07-22	Download	With the advent of modern technology, critical infrastructure, communications, and national security depend increasingly on space-based assets.
Latent Space Alignment for AI-Native MIMO Semantic Communications	Mario Edoardo Pandolfo, Simone Fiorellino, Emilio Calvanese Strinati, Paolo Di Lorenzo	2025-07-22	Download	Semantic communications focus on prioritizing the understanding of the meaning behind transmitted data and ensuring the successful completion of tasks that motivate the exchange of information.
An Experimental Study of Split-Learning TinyML on Ultra-Low-Power Edge/IoT Nodes	Zied Jenhani, Mounir Bensalem, Jasenka Dizdarević, Admela Jukan	2025-07-22	Download	Running deep learning inference directly on ultra-low-power edge/IoT nodes has been limited by the tight memory and compute budgets of microcontrollers.
The Sweet Danger of Sugar: Debunking Representation Learning for Encrypted Traffic Classification	Yuqi Zhao, Giovanni Dettori, Matteo Boffa, Luca Vassio, Marco Mellia	2025-07-22	Download	Recently we have witnessed the explosion of proposals that, inspired by Language Models like BERT, exploit Representation Learning models to create traffic representations.

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Analysis of Post-Quantum Cryptography in User Equipment in 5G and Beyond	Sanzida Hoque, Abdullah Aydeger, Engin Zeydan, Madhusanka Liyanage	2025-07-22	Download	The advent of quantum computing threatens the security of classical public-key cryptographic systems, prompting the transition to post-quantum cryptography (PQC).
AcceleratedKernels.jl: Cross-Architecture Parallel Algorithms from a Unified, Transpiled Codebase	Andrei-Leonard Nicusan, Dominik Werner, Simon Branford, Simon Hartley, Andrew J. Morris, Kit Windows-Yule	2025-07-22	Download	AcceleratedKernels.jl is introduced as a backend-agnostic library for parallel computing in Julia, natively targeting NVIDIA, AMD, Intel, and Apple accelerators via a unique transpilation architecture...
From Profiling to Optimization: Unveiling the Profile Guided Optimization	Bingxin Liu, Yinghui Huang, Jianhua Gao, Jianjun Shi, Yongpeng Liu, Yipin Sun, Weixing Ji	2025-07-22	Download	Profile Guided Optimization (PGO) uses runtime profiling to direct compiler optimization decisions, effectively combining static analysis with actual execution behavior to enhance performance.
STAlloc: Enhancing Memory Efficiency in Large-Scale Model Training with Spatio-Temporal Planning	Zixiao Huang, Junhao Hu, Hao Lin, Chunyang Zhu, Yueran Tang, Quanlu Zhang, Zhen Guo, Zhenhua Li, Shengen Yan, Zhenhua Zhu, Guohao Dai, Yu Wang	2025-07-22	Download	The rapid scaling of large language models (LLMs) has significantly increased GPU memory pressure, which is further aggravated by training optimization techniques such as virtual pipeline and recomput...