2025-07-22

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| MTU: The Multifunction Tree Unit for Accelerating Zero-Knowledge Proofs | Jianqiao Mo, Alhad Daftardar, Joey Ah-Kiow, Kaiyue Guo, Benedikt Bünz, Siddharth Garg, Brandon Reagen | 2025-07-22 | Download | Zero-Knowledge Proofs (ZKPs) are critical for privacy-preserving techniques and verifiable computation. Many ZKP protocols rely on key kernels such as the SumCheck protocol and Merkle Tree commitments... |
| Custom Algorithm-based Fault Tolerance for Attention Layers in Transformers | Vasileios Titopoulos, Kosmas Alexandridis, Giorgos Dimitrakopoulos | 2025-07-22 | Download | Transformers and large language models (LLMs), powered by the attention mechanism, have transformed numerous AI applications, driving the need for specialized hardware accelerators. |
| Augmenting Von Neumann's Architecture for an Intelligent Future | Rajpreet Singh, Vidhi Kothari | 2025-07-22 | Download | This work presents a novel computer architecture that extends the Von Neumann model with a dedicated Reasoning Unit (RU) to enable native artificial general intelligence capabilities. |
| Optimization of DNN-based HSI Segmentation FPGA-based SoC for ADS: A Practical Approach | Jon Gutiérrez-Zaballa, Koldo Basterretxea, Javier Echanobe | 2025-07-22 | Download | The use of HSI for autonomous navigation is a promising research field aimed at improving the accuracy and robustness of detection, tracking, and scene understanding systems based on vision sensors. |
| Ironman: Accelerating Oblivious Transfer Extension for Privacy-Preserving AI with Near-Memory Processing | Chenqi Lin, Kang Yang, Tianshi Xu, Ling Liang, Yufei Wang, Zhaohui Chen, Runsheng Wang, Mingyu Gao, Meng Li | 2025-07-22 | Download | With the wide application of machine learning (ML), privacy concerns arise with user data as they may contain sensitive information. Privacy-preserving ML (PPML) based on cryptographic primitives has ... |
| ApproxGNN: A Pretrained GNN for Parameter Prediction in Design Space Exploration for Approximate Computing | Ondrej Vlcek, Vojtech Mrazek | 2025-07-22 | Download | Approximate computing offers promising energy efficiency benefits for error-tolerant applications, but discovering optimal approximations requires extensive design space exploration (DSE). |
| Hourglass Sorting: A novel parallel sorting algorithm and its implementation | Daniel Bascones, Borja Morcillo | 2025-07-22 | Download | Sorting is one of the fundamental problems in computer science. Playing a role in many processes, it has a lower complexity bound imposed by $\mathcal{O}(n\log{n})$ when executing on a sequential mach... |
| SVAgent: AI Agent for Hardware Security Verification Assertion | Rui Guo, Avinash Ayalasomayajula, Henian Li, Jingbo Zhou, Sujan Kumar Saha, Farimah Farahmandi | 2025-07-22 | Download | Verification using SystemVerilog assertions (SVA) is one of the most popular methods for detecting circuit design vulnerabilities. However, with the globalization of integrated circuit design and the ... |
| RealBench: Benchmarking Verilog Generation Models with Real-World IP Designs | Pengwei Jin, Di Huang, Chongxiao Li, Shuyao Cheng, Yang Zhao, Xinyao Zheng, Jiaguo Zhu, Shuyi Xing, Bohan Dou, Rui Zhang, Zidong Du, Qi Guo, Xing Hu | 2025-07-22 | Download | The automatic generation of Verilog code using Large Language Models (LLMs) has garnered significant interest in hardware design automation. However, existing benchmarks for evaluating LLMs in Verilog... |
| A Sparsity-Aware Autonomous Path Planning Accelerator with HW/SW Co-Design and Multi-Level Dataflow Optimization | Yifan Zhang, Xiaoyu Niu, Hongzheng Tian, Yanjun Zhang, Bo Yu, Shaoshan Liu, Sitao Huang | 2025-07-22 | Download | Path planning is critical for autonomous driving, generating smooth, collision-free, feasible paths based on perception and localization inputs. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Cooling Matters: Benchmarking Large Language Models and Vision-Language Models on Liquid-Cooled Versus Air-Cooled H100 GPU Systems | Imran Latif, Muhammad Ali Shafique, Hayat Ullah, Alex C. Newkirk, Xi Yu, Arslan Munir | 2025-07-22 | Download | The unprecedented growth in artificial intelligence (AI) workloads, recently dominated by large language models (LLMs) and vision-language models (VLMs), has intensified power and cooling demands in d... |
| Collaborative Inference and Learning between Edge SLMs and Cloud LLMs: A Survey of Algorithms, Execution, and Open Challenges | Senyao Li, Haozhao Wang, Wenchao Xu, Rui Zhang, Song Guo, Jingling Yuan, Xian Zhong, Tianwei Zhang, Ruixuan Li | 2025-07-22 | Download | As large language models (LLMs) evolve, deploying them solely in the cloud or compressing them for edge devices has become inadequate due to concerns about latency, privacy, cost, and personalization. |
| AcceleratedKernels.jl: Cross-Architecture Parallel Algorithms from a Unified, Transpiled Codebase | Andrei-Leonard Nicusan, Dominik Werner, Simon Branford, Simon Hartley, Andrew J. Morris, Kit Windows-Yule | 2025-07-22 | Download | AcceleratedKernels.jl is introduced as a backend-agnostic library for parallel computing in Julia, natively targeting NVIDIA, AMD, Intel, and Apple accelerators via a unique transpilation architecture... |
| FOGNITE: Federated Learning-Enhanced Fog-Cloud Architecture | Somayeh Sobati-M | 2025-07-22 | Download | Modern smart grids demand fast, intelligent, and energy-aware computing at the edge to manage real time fluctuations and ensure reliable operation. |
| An Experimental Study of Split-Learning TinyML on Ultra-Low-Power Edge/IoT Nodes | Zied Jenhani, Mounir Bensalem, Jasenka Dizdarević, Admela Jukan | 2025-07-22 | Download | Running deep learning inference directly on ultra-low-power edge/IoT nodes has been limited by the tight memory and compute budgets of microcontrollers. |
| Autonomous Dominant Resource Fairness for Blockchain Ecosystems | Serdar Metin | 2025-07-22 | Download | Blockchain systems have been a part of mainstream academic research, and a hot topic at that. It has spread to almost every subfield in the computer science literature, as well as economics and financ... |
| STAlloc: Enhancing Memory Efficiency in Large-Scale Model Training with Spatio-Temporal Planning | Zixiao Huang, Junhao Hu, Hao Lin, Chunyang Zhu, Yueran Tang, Quanlu Zhang, Zhen Guo, Zhenhua Li, Shengen Yan, Zhenhua Zhu, Guohao Dai, Yu Wang | 2025-07-22 | Download | The rapid scaling of large language models (LLMs) has significantly increased GPU memory pressure, which is further aggravated by training optimization techniques such as virtual pipeline and recomput... |
| Improved Wake-Up Time For Euclidean Freeze-Tag Problem | Sharareh Alipour, Arash Ahadi, Kajal Baghestani | 2025-07-22 | Download | The Freeze-Tag Problem (FTP) involves activating a set of initially asleep robots as quickly as possible, starting from a single awake robot. Once activated, a robot can assist in waking up other robo... |
| Parallel Ray Tracing of Black Hole Images Using the Schwarzschild Metric | Liam Naddell, Marcelo Ponce | 2025-07-22 | Download | Rendering images of black holes by utilizing ray tracing techniques is a common methodology employed in many aspects of scientific and astrophysical visualizations. |
| DP2Guard: A Lightweight and Byzantine-Robust Privacy-Preserving Federated Learning Scheme for Industrial IoT | Baofu Han, Bing Li, Yining Qi, Raja Jurdak, Kaibin Huang, Chau Yuen | 2025-07-22 | Download | Privacy-Preserving Federated Learning (PPFL) has emerged as a secure distributed Machine Learning (ML) paradigm that aggregates locally trained gradients without exposing raw data. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Analysis of Post-Quantum Cryptography in User Equipment in 5G and Beyond | Sanzida Hoque, Abdullah Aydeger, Engin Zeydan, Madhusanka Liyanage | 2025-07-22 | Download | The advent of quantum computing threatens the security of classical public-key cryptographic systems, prompting the transition to post-quantum cryptography (PQC). |
| SoK: Securing the Final Frontier for Cybersecurity in Space-Based Infrastructure | Nafisa Anjum, Tasnuva Farheen | 2025-07-22 | Download | With the advent of modern technology, critical infrastructure, communications, and national security depend increasingly on space-based assets. |
| Latent Space Alignment for AI-Native MIMO Semantic Communications | Mario Edoardo Pandolfo, Simone Fiorellino, Emilio Calvanese Strinati, Paolo Di Lorenzo | 2025-07-22 | Download | Semantic communications focus on prioritizing the understanding of the meaning behind transmitted data and ensuring the successful completion of tasks that motivate the exchange of information. |
| An Experimental Study of Split-Learning TinyML on Ultra-Low-Power Edge/IoT Nodes | Zied Jenhani, Mounir Bensalem, Jasenka Dizdarević, Admela Jukan | 2025-07-22 | Download | Running deep learning inference directly on ultra-low-power edge/IoT nodes has been limited by the tight memory and compute budgets of microcontrollers. |
| The Sweet Danger of Sugar: Debunking Representation Learning for Encrypted Traffic Classification | Yuqi Zhao, Giovanni Dettori, Matteo Boffa, Luca Vassio, Marco Mellia | 2025-07-22 | Download | Recently we have witnessed the explosion of proposals that, inspired by Language Models like BERT, exploit Representation Learning models to create traffic representations. |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Analysis of Post-Quantum Cryptography in User Equipment in 5G and Beyond | Sanzida Hoque, Abdullah Aydeger, Engin Zeydan, Madhusanka Liyanage | 2025-07-22 | Download | The advent of quantum computing threatens the security of classical public-key cryptographic systems, prompting the transition to post-quantum cryptography (PQC). |
| AcceleratedKernels.jl: Cross-Architecture Parallel Algorithms from a Unified, Transpiled Codebase | Andrei-Leonard Nicusan, Dominik Werner, Simon Branford, Simon Hartley, Andrew J. Morris, Kit Windows-Yule | 2025-07-22 | Download | AcceleratedKernels.jl is introduced as a backend-agnostic library for parallel computing in Julia, natively targeting NVIDIA, AMD, Intel, and Apple accelerators via a unique transpilation architecture... |
| From Profiling to Optimization: Unveiling the Profile Guided Optimization | Bingxin Liu, Yinghui Huang, Jianhua Gao, Jianjun Shi, Yongpeng Liu, Yipin Sun, Weixing Ji | 2025-07-22 | Download | Profile Guided Optimization (PGO) uses runtime profiling to direct compiler optimization decisions, effectively combining static analysis with actual execution behavior to enhance performance. |
| STAlloc: Enhancing Memory Efficiency in Large-Scale Model Training with Spatio-Temporal Planning | Zixiao Huang, Junhao Hu, Hao Lin, Chunyang Zhu, Yueran Tang, Quanlu Zhang, Zhen Guo, Zhenhua Li, Shengen Yan, Zhenhua Zhu, Guohao Dai, Yu Wang | 2025-07-22 | Download | The rapid scaling of large language models (LLMs) has significantly increased GPU memory pressure, which is further aggravated by training optimization techniques such as virtual pipeline and recomput... |