2026-04-04

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Mambalaya: Einsum-Based Fusion Optimizations on State-Space Models | Toluwanimi O. Odemuyiwa, John D. Owens, Joel S. Emer, Michael Pellauer | 2026-04-04 | Download | Mamba is an emerging, complex workload with various short-range and long-range dependencies, nonlinearities, and elementwise computations that are unable to run at near-peak speeds on modern hardware. |
| L-SPINE: A Low-Precision SIMD Spiking Neural Compute Engine for Resource-efficient Edge Inference | Sonu Kumar, Mukul Lokhande, Santosh Kumar Vishvakarma | 2026-04-04 | Download | Spiking Neural Networks (SNNs) offer a promising solution for energy-efficient edge intelligence; however, their hardware deployment is constrained by memory overhead, inefficient scaling operations, ... |
| Efficient Solving for Dynamic Data Structure Constraint Satisfaction Problem | Nanbing Li, Weijie Peng, Jin Luo, Shuai Wang, Yihui Li, Jun Fang, Yun Liang | 2026-04-04 | Download | Functional verification plays a central role in ensuring the correctness of modern integrated circuit designs, where constrained-random verification is widely adopted to generate diverse stimuli under... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| SecureAFL: Secure Asynchronous Federated Learning | Anjun Gao, Feng Wang, Zhenglin Wan, Yueyang Quan, Zhuqing Liu, Minghong Fang | 2026-04-04 | Download | Federated learning (FL) enables multiple clients to collaboratively train a global machine learning model via a server without sharing their private training data. |
| GPU-Accelerated Quantum Simulation: Empirical Backend Selection, Gate Fusion, and Adaptive Precision | Poornima Kumaresan, Pavithra Muruganantham, Lakshmi Rajendran, Santhosh Sivasubramani | 2026-04-04 | Download | Classical simulation of quantum circuits remains indispensable for algorithm development, hardware validation, and error analysis in the noisy intermediate-scale quantum (NISQ) era. |
| DéjàVu: A Minimalistic Mechanism for Distributed Plurality Consensus | Francesco d'Amore, Niccolò D'Archivio, George Giakkoupis, Frédéric Giroire, Emanuele Natale | 2026-04-04 | Download | We study the plurality consensus problem in distributed systems where a population of extremely simple agents, each initially holding one of k opinions, aims to agree on the initially most frequent on... |
| Minos: Systematically Classifying Performance and Power Characteristics of GPU Workloads on HPC Clusters | Rutwik Jain, Yiwei Jiang, Matthew D. Sinclair, Shivaraman Venkataraman | 2026-04-04 | Download | As large-scale HPC compute clusters increasingly adopt accelerators such as GPUs to meet the voracious demands of modern workloads, these clusters are increasingly becoming power constrained. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Latency-Aware Resource Allocation over Heterogeneous Networks: A Lorentz-Invariant Market Mechanism | Saad Alqithami | 2026-04-04 | Download | We present a telecom-native auction mechanism for allocating bandwidth and time slots across heterogeneous-delay networks, ranging from low-Earth-orbit (LEO) satellite constellations to delay-tolerant... |
| Graduated Trust Gating for IoT Location Verification: Trading Off Detection and Proof Escalation | Yoshiyuki Ootani | 2026-04-04 | Download | IoT location services accept client-reported GPS coordinates at face value, yet spoofing is trivial with consumer-grade tools. Existing spoofing detectors output a binary decision, forcing system desi... |
| CCA Reimagined: An Exploratory Study of Large Language Models for Congestion Control | Xiaoxuan Qin, Yufei Wang, Longfei Shangguan | 2026-04-04 | Download | In this paper, we conduct an emulation-guided study to systematically investigate the feasibility of large language model (LLM)-driven congestion control. |
| CB-VER: A Stable Foundation for Modular Control Plane Verification | Dexin Zhang, Timothy Alberdingk Thijm, David Walker, Aarti Gupta | 2026-04-04 | Download | Network operators are often interested in verifying *eventually-stable properties* of network control planes: properties of control plane states that hold eventually, and hold forever thereafter,... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Performance Evaluation of Subroutines Call in PHP | Yordan Kalmukov | 2026-04-04 | Download | One of the most popular and basic principles in programming is the DRY principle (don't repeat yourself). According to it, code duplication should be avoided within a single application. |
| Minos: Systematically Classifying Performance and Power Characteristics of GPU Workloads on HPC Clusters | Rutwik Jain, Yiwei Jiang, Matthew D. Sinclair, Shivaraman Venkataraman | 2026-04-04 | Download | As large-scale HPC compute clusters increasingly adopt accelerators such as GPUs to meet the voracious demands of modern workloads, these clusters are increasingly becoming power constrained. |
| From 8 Seconds to 370ms: Kernel-Fused SAR Imaging on Apple Silicon via Single-Dispatch FFT Pipelines | Mohamed Amine Bergach | 2026-04-04 | Download | We present the first kernel-fused SAR Range Doppler pipeline on any GPU platform. By fusing FFT, matched-filter multiply, and IFFT into a single Metal compute dispatch -- keeping all intermediate data... |
