2026-01-28

cs.AR - Hardware Architecture

Title	Authors	Published	PDF	Abstract
REASON: Accelerating Probabilistic Logical Reasoning for Scalable Neuro-Symbolic Intelligence	Zishen Wan, Che-Kai Liu, Jiayi Qian, Hanchen Yang, Arijit Raychowdhury, Tushar Krishna	2026-01-28	Download	Neuro-symbolic AI systems integrate neural perception with symbolic reasoning to enable data-efficient, interpretable, and robust intelligence beyond purely neural models.
Beyond GEMM-Centric NPUs: Enabling Efficient Diffusion LLM Sampling	Binglei Lou, Haoran Wu, Yao Lai, Jiayi Nie, Can Xiao, Xuan Guo, Rika Antonova, Robert Mullins, Aaron Zhao	2026-01-28	Download	Diffusion Large Language Models (dLLMs) introduce iterative denoising to enable parallel token generation, but their sampling phase displays fundamentally different characteristics compared to GEMM-ce...
GRTX: Efficient Ray Tracing for 3D Gaussian-Based Rendering	Junseo Lee, Sangyun Jeon, Jungi Lee, Junyong Park, Jaewoong Sim	2026-01-28	Download	3D Gaussian Splatting has gained widespread adoption across diverse applications due to its exceptional rendering performance and visual quality.
VersaQ-3D: A Reconfigurable Accelerator Enabling Feed-Forward and Generalizable 3D Reconstruction via Versatile Quantization	Yipu Zhang, Jintao Cheng, Xingyu Liu, Zeyu Li, Carol Jingyi Li, Jin Wu, Lin Jiang, Yuan Xie, Jiang Xu, Wei Zhang	2026-01-28	Download	The Visual Geometry Grounded Transformer (VGGT) enables strong feed-forward 3D reconstruction without per-scene optimization. However, its billion-parameter scale creates high memory and compute deman...
SATA: Sparsity-Aware Scheduling for Selective Token Attention	Zhenkun Fan, Zishen Wan, Che-Kai Liu, Ashwin Sanjay Lele, Win-San Khwa, Bo Zhang, Meng-Fan Chang, Arijit Raychowdhury	2026-01-28	Download	Transformers have become the foundation of numerous state-of-the-art AI models across diverse domains, thanks to their powerful attention mechanism for modeling long-range dependencies.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Deep Reinforcement Learning for Fault-Adaptive Routing in Eisenstein-Jacobi Interconnection Topologies	Mohammad Walid Charrwi, Zaid Hussain	2026-01-28	Download	The increasing density of many-core architectures necessitates interconnection networks that are both high-performance and fault-resilient. Eisenstein-Jacobi (EJ) networks, with their symmetric 6-regu...
Agentic Fog: A Policy-driven Framework for Distributed Intelligence in Fog Computing	Saeed Akbar, Muhammad Waqas, Rahmat Ullah	2026-01-28	Download	Fog and edge computing require adaptive control schemes that can handle partial observability, severe latency requirements, and dynamically changing workloads.
SA-PEF: Step-Ahead Partial Error Feedback for Efficient Federated Learning	Dawit Kiros Redie, Reza Arablouei, Stefan Werner	2026-01-28	Download	Biased gradient compression with error feedback (EF) reduces communication in federated learning (FL), but under non-IID data, the residual error can decay slowly, causing gradient mismatch and stalle...
Beyond GEMM-Centric NPUs: Enabling Efficient Diffusion LLM Sampling	Binglei Lou, Haoran Wu, Yao Lai, Jiayi Nie, Can Xiao, Xuan Guo, Rika Antonova, Robert Mullins, Aaron Zhao	2026-01-28	Download	Diffusion Large Language Models (dLLMs) introduce iterative denoising to enable parallel token generation, but their sampling phase displays fundamentally different characteristics compared to GEMM-ce...
OnePiece: A Large-Scale Distributed Inference System with RDMA for Complex AI-Generated Content (AIGC) Workflows	June Chen, Neal Xu, Gragas Huang, Bok Zhou, Stephen Liu	2026-01-28	Download	The rapid growth of AI-generated content (AIGC) has enabled high-quality creative production across diverse domains, yet existing systems face critical inefficiencies in throughput, resource utilizati...
Syncopate: Efficient Multi-GPU AI Kernels via Automatic Chunk-Centric Compute-Communication Overlap	Xinwei Qiang, Yue Guan, Zhengding Hu, Keren Zhou, Yufei Ding, Adnan Aziz	2026-01-28	Download	Communication has become a first-order bottleneck in large-scale GPU workloads, and existing distributed compilers address it mainly by overlapping whole compute and communication kernels at the stream...
Rethinking Thread Scheduling under Oversubscription: A User-Space Framework for Coordinating Multi-runtime and Multi-process Workloads	Aleix Roca, Vicenç Beltran	2026-01-28	Download	The convergence of high-performance computing (HPC) and artificial intelligence (AI) is driving the emergence of increasingly complex parallel applications and workloads.
Meeting SLOs, Slashing Hours: Automated Enterprise LLM Optimization with OptiKIT	Nicholas Santavas, Kareem Eissa, Patrycja Cieplicka, Piotr Florek, Matteo Nulli, Stefan Vasilev, Seyyed Hadi Hashemi, Antonios Gasteratos, Shahram Khadivi	2026-01-28	Download	Enterprise LLM deployment faces a critical scalability challenge: organizations must optimize models systematically to scale AI initiatives within constrained compute budgets, yet the specialized expe...
Graph-Structured Deep Learning Framework for Multi-task Contention Identification with High-dimensional Metrics	Xiao Yang, Yinan Ni, Yuqi Tang, Zhimin Qiu, Chen Wang, Tingzhou Yuan	2026-01-28	Download	This study addresses the challenge of accurately identifying multi-task contention types in high-dimensional system environments and proposes a unified contention classification framework that integra...
LIFT: Byzantine Resilient Hub-Sampling	Mohamed Amine Legheraba, Nour Rachdi, Maria Gradinariu Potop-Butucaru, Sébastien Tixeuil	2026-01-28	Download	Recently, a novel peer sampling protocol, Elevator, was introduced to construct network topologies tailored for emerging decentralized applications such as federated learning and blockchain.
SuperInfer: SLO-Aware Rotary Scheduling and Memory Management for LLM Inference on Superchips	Jiahuan Yu, Mingtao Hu, Zichao Lin, Minjia Zhang	2026-01-28	Download	Large Language Model (LLM) serving faces a fundamental tension between stringent latency Service Level Objectives (SLOs) and limited GPU memory capacity.
StreamFusion: Scalable Sequence Parallelism for Distributed Inference of Diffusion Transformers on GPUs	Jiacheng Yang, Jun Wu, Yaoyao Ding, Zhiying Xu, Yida Wang, Gennady Pekhimenko	2026-01-28	Download	Diffusion Transformers (DiTs) have gained increasing adoption in high-quality image and video generation. As demand for higher-resolution images and longer videos increases, single-GPU inference becom...
Securing AI Agents in Cyber-Physical Systems: A Survey of Environmental Interactions, Deepfake Threats, and Defenses	Mohsen Hatami, Van Tuan Pham, Hozefa Lakadawala, Yu Chen	2026-01-28	Download	The increasing integration of AI agents into cyber-physical systems (CPS) introduces new security risks that extend beyond traditional cyber or physical threat models.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
An Explainable Failure Prediction Framework for Neural Networks in Radio Access Networks	Khaleda Papry, Francesco Spinnato, Marco Fiore, Mirco Nanni, Israat Haque	2026-01-28	Download	As 5G networks continue to evolve to deliver high speed, low latency, and reliable communications, ensuring uninterrupted service has become increasingly critical.
Immersive Volumetric Video Playback: Near-RT Resource Allocation and O-RAN-based Implementation	Yao Wen, Luping Xiang, Kun Yang	2026-01-28	Download	Immersive volumetric video streaming in extended reality (XR) demands ultra-low motion-to-photon (MTP) latency, which conventional edge-centric architectures struggle to meet due to per-frame computat...
Dependable Connectivity for Industrial Wireless Communication Networks	Nurul Huda Mahmood, Onel L. A. Lopez, David Ruiz-Guirola, Frank Burkhardt, Mehdi Rasti, Matti Latva-aho	2026-01-28	Download	Dependability - a system's ability to consistently provide reliable services by ensuring safety and maintainability in the face of internal or external disruptions - is a fundamental requirement for i...
IoT Device Identification with Machine Learning: Common Pitfalls and Best Practices	Kahraman Kostas, Rabia Yasa Kostas	2026-01-28	Download	This paper critically examines the device identification process using machine learning, addressing common pitfalls in existing literature. We analyze the trade-offs between identification methods (un...
Pocket RAG: On-Device RAG for First Aid Guidance in Offline Mobile Environment	Dong Ho Kang, Hyunjoon Lee, Hyeonjeong Cha, Minkyu Choi, Sungsoo Lim	2026-01-28	Download	In disaster scenarios or remote areas, first responders often lose network connectivity when providing first aid. In such situations, server-based AI systems fail to provide critical guidance.
Proactive SFC Provisioning with Forecast-Driven DRL in Data Centers	Parisa Fard Moshiri, Poonam Lohan, Burak Kantarci, Emil Janulewicz	2026-01-28	Download	Service Function Chaining (SFC) requires efficient placement of Virtual Network Functions (VNFs) to satisfy diverse service requirements while maintaining high resource utilization in Data Centers (DC...

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
Meta-ROS: A Next-Generation Middleware Architecture for Adaptive and Scalable Robotic Systems	Anshul Ranjan, Anoosh Damodar, Neha Chougule, Dhruva S Nayak, Anantharaman P. N, Shylaja S S	2026-01-28	Download	The field of robotics faces significant challenges related to the complexity and interoperability of existing middleware frameworks, like ROS2, which can be difficult for new developers to adopt.
/dev/SDB: Software Defined Boot -- A novel standard for diskless booting anywhere and everywhere	Aditya Mitra, Hamza Haroon, Amaan Rais Shah, Mohammad Elham Rasooli, Bogdan Itsam Dorantes Nikolaev, Tuğçe Ballı	2026-01-28	Download	A computer is nothing but a device that processes the instructions supplied to it. However, as computers evolved, the instructions or codes started to be more complicated.
Rethinking Thread Scheduling under Oversubscription: A User-Space Framework for Coordinating Multi-runtime and Multi-process Workloads	Aleix Roca, Vicenç Beltran	2026-01-28	Download	The convergence of high-performance computing (HPC) and artificial intelligence (AI) is driving the emergence of increasingly complex parallel applications and workloads.

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
The Multiserver-Job Stochastic Recurrence Equation for Cloud Computing Performance Evaluation	Francois Baccelli, Diletta Olliaro, Marco Ajmone Marsan, Andrea Marin	2026-01-28	Download	We study the Multiserver-Job Queuing Model (MJQM) with general independent arrivals and service times under FCFS scheduling, using stochastic recurrence equations (SREs) and ergodic theory.
Colored Markov Modulated Fluid Queues	Benny Van Houdt	2026-01-28	Download	Markov-modulated fluid queues (MMFQs) are a powerful modeling framework for analyzing the performance of computer and communication systems. Their distinguishing feature is that the underlying Markov ...