2025-12-03

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Spiking Manifesto	Eugene Izhikevich	2025-12-03	Download	Practically everything computers do is better, faster, and more power-efficient than the brain. For example, a calculator performs numerical computations more energy-efficiently than any human.
A Spatial Array for Spectrally Agile Wireless Processing	Ali Rasteh, Andrew Hennessee, Ishaan Shivhare, Siddharth Garg, Sundeep Rangan, Brandon Reagen	2025-12-03	Download	Massive MIMO is a cornerstone of next-generation wireless communication, offering significant gains in capacity, reliability, and energy efficiency.
The BrainScaleS-2 multi-chip system: Interconnecting continuous-time neuromorphic compute substrates	Joscha Ilmberger, Johannes Schemmel	2025-12-03	Download	The BrainScaleS-2 SoC integrates analog neuron and synapse circuits with digital periphery, including two CPUs with SIMD extensions. Each ASIC is connected to a Node-FPGA, providing experiment control...
Lightweight Unified Sha-3/Shake Architecture with a Fault-Resilient State	Christian Ewert, Amrit Sharma Poudel, Mouadh Ayache, Andrija Neskovic, Rainer Buchty, Mladen Berekovic, Sebastian Berndt, Saleh Mulhem	2025-12-03	Download	Hash functions have become a key part of standard Post-quantum cryptography (PQC) schemes, especially Sha-3 and Shake, calling for lightweight implementation.
KVNAND: Efficient On-Device Large Language Model Inference Using DRAM-Free In-Flash Computing	Lishuo Deng, Shaojie Xu, Jinwu Chen, Changwei Yan, Jiajie Wang, Zhe Jiang, Weiwei Shan	2025-12-03	Download	Deploying large language models (LLMs) on edge devices enables personalized agents with strong privacy and low cost. However, with tens to hundreds of billions of parameters, single-batch autoregressi...
Accelerating Detailed Routing Convergence through Offline Reinforcement Learning	Afsara Khan, Austin Rovinski	2025-12-03	Download	Detailed routing remains one of the most complex and time-consuming steps in modern physical design due to the challenges posed by shrinking feature sizes and stricter design rules.
In-Situ Encryption of Single-Transistor Nonvolatile Memories without Density Loss	Sanwar Ahmed Ovy, Jiahui Duan, Md Ashraful Islam Romel, Franz Muller, Thomas Kampfe, Kai Ni, Sumitha George	2025-12-03	Download	Non-volatile memories (NVMs) offer negligible leakage power consumption, high integration density, and data retention, but their non-volatility also raises the risk of data exposure.
Agentic Operator Generation for ML ASICs	Alec M. Hammond, Aram Markosyan, Aman Dontula, Simon Mahns, Zacharias Fisches, Dmitrii Pedchenko, Keyur Muzumdar, Natacha Supper, Mark Saroufim, Joe Isaacson, Laura Wang, Warren Hunt, Kaustubh Gondkar, Roman Levenstein, Gabriel Synnaeve, Richard Li, Jacob Kahn, Ajit Mathews	2025-12-03	Download	We present TritorX, an agentic AI system designed to generate functionally correct Triton PyTorch ATen kernels at scale for emerging accelerator platforms.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
VLCs: Managing Parallelism with Virtualized Libraries	Yineng Yan, William Ruys, Hochan Lee, Ian Henriksen, Arthur Peters, Sean Stephens, Bozhi You, Henrique Fingler, Martin Burtscher, Milos Gligoric, Keshav Pingali, Mattan Erez, George Biros, Christopher J. Rossbach	2025-12-03	Download	As the complexity and scale of modern parallel machines continue to grow, programmers increasingly rely on composition of software libraries to encapsulate and exploit parallelism.
Scaling MPI Applications on Aurora	Huda Ibeid, Anthony-Trung Nguyen, Aditya Nishtala, Premanand Sakarda, Larry Kaplan, Nilakantan Mahadevan, Michael Woodacre, Victor Anisimov, Kalyan Kumaran, JaeHyuk Kwack, Vitali Morozov, Servesh Muralidharan, Scott Parker	2025-12-03	Download	The Aurora supercomputer, which was deployed at Argonne National Laboratory in 2024, is currently one of three Exascale machines in the world on the Top500 list.
tritonBLAS: Triton-based Analytical Approach for GEMM Kernel Parameter Selection	Ryan Swann, Muhammad Osama, Xiaohu Guo, Bryant Nelson, Lixun Zhang, Alex Brown, Yen Ong, Ali Yazdani, Sean Siddens, Ganesh Dasika, Alex Underwood	2025-12-03	Download	We present tritonBLAS, a fast and deterministic analytical model that uses architectural parameters like the cache hierarchy, and relative code and data placement to generate performant GPU GEMM kerne...
A Chronological Analysis of the Evolution of SmartNICs	Olasupo Ajayi, Ryan Grant	2025-12-03	Download	Network Interface Cards (NICs) are one of the key enablers of the modern Internet. They serve as gateways for connecting computing devices to networks for the exchange of data with other devices.
OD-MoE: On-Demand Expert Loading for Cacheless Edge-Distributed MoE Inference	Liujianfu Wang, Yuyang Du, Yuchen Pan, Soung Chang Liew, Jiacheng Liu, Kexin Chen	2025-12-03	Download	Mixture-of-Experts (MoE), while offering significant advantages as a Large Language Model (LLM) architecture, faces substantial challenges when deployed on low-cost edge devices with tight memory cons...
Integrating High Performance In-Memory Data Streaming and In-Situ Visualization in Hybrid MPI+OpenMP PIC MC Simulations Towards Exascale	Jeremy J. Williams, Stefan Costea, Daniel Medeiros, Jordy Trilaksono, Pratibha Hegde, David Tskhakaya, Leon Kos, Ales Podolnik, Jakub Hromadka, Kevin A. Huck, Allen D. Malony, Frank Jenko, Erwin Laure, Stefano Markidis	2025-12-03	Download	Efficient simulation of complex plasma dynamics is crucial for advancing fusion energy research. Particle-in-Cell (PIC) Monte Carlo (MC) simulations provide insights into plasma behavior, including tu...
Seamless Transitions: A Comprehensive Review of Live Migration Technologies	Sima Attar-Khorasani, Lincoln Sherpa, Matthias Lieber, Siavash Ghiasvand	2025-12-03	Download	Live migration, a technology enabling seamless transition of operational computational entities between various hosts while preserving continuous functionality and client connectivity, has been the su...
Acceleration of Parallel Tempering for Markov Chain Monte Carlo methods	Aingeru Ramos, Jose A Pascual, Javier Navaridas, Ivan Coluzza	2025-12-03	Download	Markov Chain Monte Carlo methods are algorithms used to sample probability distributions, commonly used to sample the Boltzmann distribution of physical/chemical models (e.g.
On the Challenges of Energy-Efficiency Analysis in HPC Systems: Evaluating Synthetic Benchmarks and Gromacs	Rafael Ravedutti Lucio Machado, Jan Eitzinger, Georg Hager, Gerhard Wellein	2025-12-03	Download	This paper discusses the challenges encountered when analyzing the energy efficiency of synthetic benchmarks and the Gromacs package on the Fritz and Alex HPC clusters.
Distributed Quantum Computing with Fan-Out Operations and Qudits: the Case of Distributed Global Gates	Seng W. Loke	2025-12-03	Download	Much recent work on distributed quantum computing has focused on the use of entangled pairs and distributed two-qubit gates. But there has also been work on efficient schemes for achieving multiparti...
FFTrainer: Fast Failover in Large-Language Model Training with Almost-Free State Management	Bohan Zhao, Yuanhong Wang, Chenglin Liu, Jiagi Pan, Guang Yang, Ruitao Liu, Tingrui Zhang, Kai Luo, Wei Xu	2025-12-03	Download	Recent developments in large language models (LLMs) have introduced new requirements for efficient and robust training. As LLM clusters scale, node failures, lengthy recoveries, and bulky checkpoints ...
Tuning of Vectorization Parameters for Molecular Dynamics Simulations in AutoPas	Luis Gall, Samuel James Newcome, Fabio Alexander Gratl, Markus Mühlhäußer, Manish Kumar Mishra, Hans-Joachim Bungartz	2025-12-03	Download	Molecular Dynamics simulations can help scientists to gather valuable insights for physical processes on an atomic scale. This work explores various techniques for SIMD vectorization to improve the pa...
Double-Edge-Assisted Computation Offloading and Resource Allocation for Space-Air-Marine Integrated Networks	Zhen Wang, Bin Lin, Qiang Ye	2025-12-03	Download	In this paper, we propose a double-edge-assisted computation offloading and resource allocation scheme tailored for space-air-marine integrated networks (SAMINs).
Agentic Operator Generation for ML ASICs	Alec M. Hammond, Aram Markosyan, Aman Dontula, Simon Mahns, Zacharias Fisches, Dmitrii Pedchenko, Keyur Muzumdar, Natacha Supper, Mark Saroufim, Joe Isaacson, Laura Wang, Warren Hunt, Kaustubh Gondkar, Roman Levenstein, Gabriel Synnaeve, Richard Li, Jacob Kahn, Ajit Mathews	2025-12-03	Download	We present TritorX, an agentic AI system designed to generate functionally correct Triton PyTorch ATen kernels at scale for emerging accelerator platforms.
TokenScale: Timely and Accurate Autoscaling for Disaggregated LLM Serving with Token Velocity	Ruiqi Lai, Hongrui Liu, Chengzhi Lu, Zonghao Liu, Siyu Cao, Siyang Shao, Yixin Zhang, Luo Mai, Dmitrii Ustiugov	2025-12-03	Download	The architectural shift to prefill/decode (PD) disaggregation in LLM serving improves resource utilization but struggles with the bursty nature of modern workloads.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Meta-Continual Mobility Forecasting for Proactive Handover Prediction	Sasi Vardhan Reddy Mandapati	2025-12-03	Download	Short-term mobility forecasting is a core requirement for proactive handover (HO) in cellular networks. Real-world mobility is highly non-stationary: abrupt turns, rapid speed changes, and unpredictab...
Simulation of a Heterogeneous Quantum Network	Hayden Miller, Caitao Zhan, Michael Bishof, Joaquin Chung, Han Xu, Prem Kumar, Rajkumar Kettimuthu	2025-12-03	Download	Quantum networks are expected to be heterogeneous systems, combining distinct qubit platforms, photon wavelengths, and device timescales to achieve scalable, multiuser connectivity.
A Chronological Analysis of the Evolution of SmartNICs	Olasupo Ajayi, Ryan Grant	2025-12-03	Download	Network Interface Cards (NICs) are one of the key enablers of the modern Internet. They serve as gateways for connecting computing devices to networks for the exchange of data with other devices.
Tutorial on Large Language Model-Enhanced Reinforcement Learning for Wireless Networks	Lingyi Cai, Wenjie Fu, Yuxi Huang, Ruichen Zhang, Yinqiu Liu, Jiawen Kang, Zehui Xiong, Tao Jiang, Dusit Niyato, Xianbin Wang, Shiwen Mao, Xuemin Shen	2025-12-03	Download	Reinforcement Learning (RL) has shown remarkable success in enabling adaptive and data-driven optimization for various applications in wireless networks.
Machine Learning to Predict Slot Usage in TSCH Wireless Sensor Networks	Stefano Scanzio, Gabriele Formis, Tullio Facchinetti, Gianluca Cena	2025-12-03	Download	Wireless sensor networks (WSNs) are employed across a wide range of industrial applications where ultra-low power consumption is a critical prerequisite.
Performance Evaluation of Parallel Wi-Fi Redundancy with Deferral Techniques	Gianluca Cena, Pietro Chiavassa, Stefano Scanzio	2025-12-03	Download	Wireless communication is increasingly used in industrial environments, since it supports mobility of interconnected devices. Among the transmission technologies operating in unlicensed bands availabl...
Mobility Induced Sensitivity of UAV based Nodes to Jamming in Private 5G Airfield Networks: An Experimental Study	Pavlo Mykytyn, Ronald Chitauro, Onur Yener, Peter Langendoerfer	2025-12-03	Download	This work presents an experimental performance evaluation of a private 5G airfield network under controlled directional SDR jamming attacks targeting UAV-based UE nodes.

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
VLCs: Managing Parallelism with Virtualized Libraries	Yineng Yan, William Ruys, Hochan Lee, Ian Henriksen, Arthur Peters, Sean Stephens, Bozhi You, Henrique Fingler, Martin Burtscher, Milos Gligoric, Keshav Pingali, Mattan Erez, George Biros, Christopher J. Rossbach	2025-12-03	Download	As the complexity and scale of modern parallel machines continue to grow, programmers increasingly rely on composition of software libraries to encapsulate and exploit parallelism.

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Integrating High Performance In-Memory Data Streaming and In-Situ Visualization in Hybrid MPI+OpenMP PIC MC Simulations Towards Exascale	Jeremy J. Williams, Stefan Costea, Daniel Medeiros, Jordy Trilaksono, Pratibha Hegde, David Tskhakaya, Leon Kos, Ales Podolnik, Jakub Hromadka, Kevin A. Huck, Allen D. Malony, Frank Jenko, Erwin Laure, Stefano Markidis	2025-12-03	Download	Efficient simulation of complex plasma dynamics is crucial for advancing fusion energy research. Particle-in-Cell (PIC) Monte Carlo (MC) simulations provide insights into plasma behavior, including tu...
Hyperdimensional Computing for Sustainable Manufacturing: An Initial Assessment	Danny Hoang, Anandkumar Patel, Ruimen Chen, Rajiv Malhotra, Farhad Imani	2025-12-03	Download	Smart manufacturing can significantly improve efficiency and reduce energy consumption, yet the energy demands of AI models may offset these gains.
Tuning of Vectorization Parameters for Molecular Dynamics Simulations in AutoPas	Luis Gall, Samuel James Newcome, Fabio Alexander Gratl, Markus Mühlhäußer, Manish Kumar Mishra, Hans-Joachim Bungartz	2025-12-03	Download	Molecular Dynamics simulations can help scientists to gather valuable insights for physical processes on an atomic scale. This work explores various techniques for SIMD vectorization to improve the pa...