2026-01-25

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Memory-Efficient FPGA Implementation of Stochastic Simulated Annealing | Duckgyu Shin, Naoya Onizawa, Warren J. Gross, Takahiro Hanyu | 2026-01-25 | Download | Simulated annealing (SA) is a well-known algorithm for solving combinatorial optimization problems. However, the computation time of SA increases rapidly, as the size of the problem grows. |
| Late Breaking Results: Boosting Efficient Dual-Issue Execution on Lightweight RISC-V Cores | Luca Colagrande, Luca Benini | 2026-01-25 | Download | Large-scale ML accelerators rely on large numbers of PEs, imposing strict bounds on the area and energy budget of each PE. Prior work demonstrates that limited dual-issue capabilities can be efficient... |
| @NTT: Algorithm-Targeted NTT hardware acceleration via Design-Time Constant Optimization | Mohammed Nabeel, Mahmoud Hafez, Michail Maniatakos | 2026-01-25 | Download | The Number Theoretic Transform (NTT) is a critical computational bottleneck in many lattice-based postquantum cryptographic (PQC) algorithms. By leveraging the Fast Fourier Transform (FFT) algorithm, ... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| MultiChain Blockchain Data Provenance for Deterministic Stream Processing with Kafka Streams: A Weather Data Case Study | Niaz Mohammad Ramaki, Florian Schintke | 2026-01-25 | Download | Auditability and reproducibility still are critical challenges for real-time data streams pipelines. Streaming engines are highly dependent on runtime scheduling, window triggers, arrival orders, and ... |
| Types for Grassroots Logic Programs | Ehud Shapiro | 2026-01-25 | Download | Grassroots Logic Programs (GLP) is a concurrent logic programming language in which logic variables are partitioned into paired readers and writers. |
| A Universal Load Balancing Principle and Its Application to Large Language Model Serving | Zixi Chen, Tianci Bu, Chendong Song, Xin Lu, Yinyu Ye, Zijie Zhou | 2026-01-25 | Download | Over 40% of computational power in Large Language Model (LLM) serving systems can be systematically wasted - not from hardware limits, but from load imbalance in barrier-synchronized parallel processi... |
| On the Extension of Private Distributed Matrix Multiplication Schemes to the Grid Partition | Christoph Hofmeister, Razane Tajeddine, Antonia Wachter-Zeh, Rawad Bitar | 2026-01-25 | Download | We consider polynomial codes for private distributed matrix multiplication (PDMM/SDMM). Existing codes for PDMM are either specialized for the outer product partitioning (OPP), or inner product partit... |
| CondenseGraph: Communication-Efficient Distributed GNN Training via On-the-Fly Graph Condensation | Zizhao Zhang, Yihan Xue, Haotian Zhu, Sijia Li, Zhijun Wang, Yujie Xiao | 2026-01-25 | Download | Distributed Graph Neural Network (GNN) training suffers from substantial communication overhead due to the inherent neighborhood dependency in graph-structured data. |
| LLM-42: Enabling Determinism in LLM Inference with Verified Speculation | Raja Gond, Aditya K Kamath, Ramachandran Ramjee, Ashish Panwar | 2026-01-25 | Download | In LLM inference, the same prompt may yield different outputs across different runs. At the system level, this non-determinism arises from floating-point non-associativity combined with dynamic batchi... |
| An MLIR Lowering Pipeline for Stencils at Wafer-Scale | Nicolai Stawinoga, David Katz, Anton Lydike, Justs Zarins, Nick Brown, George Bisbas, Tobias Grosser | 2026-01-25 | Download | The Cerebras Wafer-Scale Engine (WSE) delivers performance at an unprecedented scale of over 900,000 compute units, all connected via a single-wafer on-chip interconnect. |
| Faramesh: A Protocol-Agnostic Execution Control Plane for Autonomous Agent Systems | Amjad Fatmi | 2026-01-25 | Download | Autonomous agent systems increasingly trigger real-world side effects: deploying infrastructure, modifying databases, moving money, and executing workflows. |
| Multi-core & GPU-based Balanced Butterfly Counting in Signed Bipartite Graphs | Mekala Kiran, Apurba Das, Suman Banerjee, Tathagata Ray | 2026-01-25 | Download | Balanced butterfly counting, corresponding to counting balanced (2, 2)-bicliques, is a fundamental primitive in the analysis of signed bipartite graphs and provides a basis for studying higher-order s... |
| Kareus: Joint Reduction of Dynamic and Static Energy in Large Model Training | Ruofan Wu, Jae-Won Chung, Mosharaf Chowdhury | 2026-01-25 | Download | The computing demand of AI is growing at an unprecedented rate, but energy supply is not keeping pace. As a result, energy has become an expensive, contended resource that requires explicit management... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Multi-Agent Collaborative Intrusion Detection for Low-Altitude Economy IoT: An LLM-Enhanced Agentic AI Framework | Hongjuan Li, Hui Kang, Jiahui Li, Geng Sun, Ruichen Zhang, Jiacheng Wang, Dusit Niyato, Wei Ni, Abbas Jamalipour | 2026-01-25 | Download | The rapid expansion of low-altitude economy Internet of Things (LAE-IoT) networks has created unprecedented security challenges due to dynamic three-dimensional mobility patterns, distributed autonomo... |

cs.OS - Operating Systems

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Credit Fairness: Online Fairness In Shared Resource Pools | Seyed Majid Zahedi, Rupert Freeman | 2026-01-25 | Download | We consider a setting in which a group of agents share resources that must be allocated among them in each discrete time period. Agents have time-varying demands and derive constant marginal utility f... |
