
2026-01-28

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| REASON: Accelerating Probabilistic Logical Reasoning for Scalable Neuro-Symbolic Intelligence | Zishen Wan, Che-Kai Liu, Jiayi Qian, Hanchen Yang, Arijit Raychowdhury, Tushar Krishna | 2026-01-28 | Download | Neuro-symbolic AI systems integrate neural perception with symbolic reasoning to enable data-efficient, interpretable, and robust intelligence beyond purely neural models. |
| Beyond GEMM-Centric NPUs: Enabling Efficient Diffusion LLM Sampling | Binglei Lou, Haoran Wu, Yao Lai, Jiayi Nie, Can Xiao, Xuan Guo, Rika Antonova, Robert Mullins, Aaron Zhao | 2026-01-28 | Download | Diffusion Large Language Models (dLLMs) introduce iterative denoising to enable parallel token generation, but their sampling phase displays fundamentally different characteristics compared to GEMM-ce... |
| GRTX: Efficient Ray Tracing for 3D Gaussian-Based Rendering | Junseo Lee, Sangyun Jeon, Jungi Lee, Junyong Park, Jaewoong Sim | 2026-01-28 | Download | 3D Gaussian Splatting has gained widespread adoption across diverse applications due to its exceptional rendering performance and visual quality. |
| VersaQ-3D: A Reconfigurable Accelerator Enabling Feed-Forward and Generalizable 3D Reconstruction via Versatile Quantization | Yipu Zhang, Jintao Cheng, Xingyu Liu, Zeyu Li, Carol Jingyi Li, Jin Wu, Lin Jiang, Yuan Xie, Jiang Xu, Wei Zhang | 2026-01-28 | Download | The Visual Geometry Grounded Transformer (VGGT) enables strong feed-forward 3D reconstruction without per-scene optimization. However, its billion-parameter scale creates high memory and compute deman... |
| SATA: Sparsity-Aware Scheduling for Selective Token Attention | Zhenkun Fan, Zishen Wan, Che-Kai Liu, Ashwin Sanjay Lele, Win-San Khwa, Bo Zhang, Meng-Fan Chang, Arijit Raychowdhury | 2026-01-28 | Download | Transformers have become the foundation of numerous state-of-the-art AI models across diverse domains, thanks to their powerful attention mechanism for modeling long-range dependencies. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Deep Reinforcement Learning for Fault-Adaptive Routing in Eisenstein-Jacobi Interconnection Topologies | Mohammad Walid Charrwi, Zaid Hussain | 2026-01-28 | Download | The increasing density of many-core architectures necessitates interconnection networks that are both high-performance and fault-resilient. Eisenstein-Jacobi (EJ) networks, with their symmetric 6-regu... |
| Agentic Fog: A Policy-driven Framework for Distributed Intelligence in Fog Computing | Saeed Akbar, Muhammad Waqas, Rahmat Ullah | 2026-01-28 | Download | Fog and edge computing require adaptive control schemes that can handle partial observability, severe latency requirements, and dynamically changing workloads. |
| SA-PEF: Step-Ahead Partial Error Feedback for Efficient Federated Learning | Dawit Kiros Redie, Reza Arablouei, Stefan Werner | 2026-01-28 | Download | Biased gradient compression with error feedback (EF) reduces communication in federated learning (FL), but under non-IID data, the residual error can decay slowly, causing gradient mismatch and stalle... |
| Beyond GEMM-Centric NPUs: Enabling Efficient Diffusion LLM Sampling | Binglei Lou, Haoran Wu, Yao Lai, Jiayi Nie, Can Xiao, Xuan Guo, Rika Antonova, Robert Mullins, Aaron Zhao | 2026-01-28 | Download | Diffusion Large Language Models (dLLMs) introduce iterative denoising to enable parallel token generation, but their sampling phase displays fundamentally different characteristics compared to GEMM-ce... |
| OnePiece: A Large-Scale Distributed Inference System with RDMA for Complex AI-Generated Content (AIGC) Workflows | June Chen, Neal Xu, Gragas Huang, Bok Zhou, Stephen Liu | 2026-01-28 | Download | The rapid growth of AI-generated content (AIGC) has enabled high-quality creative production across diverse domains, yet existing systems face critical inefficiencies in throughput, resource utilizati... |
| Syncopate: Efficient Multi-GPU AI Kernels via Automatic Chunk-Centric Compute-Communication Overlap | Xinwei Qiang, Yue Guan, Zhengding Hu, Keren Zhou, Yufei Ding, Adnan Aziz | 2026-01-28 | Download | Communication has become a first-order bottleneck in large-scale GPU workloads, and existing distributed compilers address it mainly by overlapping whole compute and communication kernels at the stream... |
| Rethinking Thread Scheduling under Oversubscription: A User-Space Framework for Coordinating Multi-runtime and Multi-process Workloads | Aleix Roca, Vicenç Beltran | 2026-01-28 | Download | The convergence of high-performance computing (HPC) and artificial intelligence (AI) is driving the emergence of increasingly complex parallel applications and workloads. |
| Meeting SLOs, Slashing Hours: Automated Enterprise LLM Optimization with OptiKIT | Nicholas Santavas, Kareem Eissa, Patrycja Cieplicka, Piotr Florek, Matteo Nulli, Stefan Vasilev, Seyyed Hadi Hashemi, Antonios Gasteratos, Shahram Khadivi | 2026-01-28 | Download | Enterprise LLM deployment faces a critical scalability challenge: organizations must optimize models systematically to scale AI initiatives within constrained compute budgets, yet the specialized expe... |
| Graph-Structured Deep Learning Framework for Multi-task Contention Identification with High-dimensional Metrics | Xiao Yang, Yinan Ni, Yuqi Tang, Zhimin Qiu, Chen Wang, Tingzhou Yuan | 2026-01-28 | Download | This study addresses the challenge of accurately identifying multi-task contention types in high-dimensional system environments and proposes a unified contention classification framework that integra... |
| LIFT: Byzantine Resilient Hub-Sampling | Mohamed Amine Legheraba, Nour Rachdi, Maria Gradinariu Potop-Butucaru, Sébastien Tixeuil | 2026-01-28 | Download | Recently, a novel peer sampling protocol, Elevator, was introduced to construct network topologies tailored for emerging decentralized applications such as federated learning and blockchain. |
| SuperInfer: SLO-Aware Rotary Scheduling and Memory Management for LLM Inference on Superchips | Jiahuan Yu, Mingtao Hu, Zichao Lin, Minjia Zhang | 2026-01-28 | Download | Large Language Model (LLM) serving faces a fundamental tension between stringent latency Service Level Objectives (SLOs) and limited GPU memory capacity. |
| StreamFusion: Scalable Sequence Parallelism for Distributed Inference of Diffusion Transformers on GPUs | Jiacheng Yang, Jun Wu, Yaoyao Ding, Zhiying Xu, Yida Wang, Gennady Pekhimenko | 2026-01-28 | Download | Diffusion Transformers (DiTs) have gained increasing adoption in high-quality image and video generation. As demand for higher-resolution images and longer videos increases, single-GPU inference becom... |
| Securing AI Agents in Cyber-Physical Systems: A Survey of Environmental Interactions, Deepfake Threats, and Defenses | Mohsen Hatami, Van Tuan Pham, Hozefa Lakadawala, Yu Chen | 2026-01-28 | Download | The increasing integration of AI agents into cyber-physical systems (CPS) introduces new security risks that extend beyond traditional cyber or physical threat models. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| An Explainable Failure Prediction Framework for Neural Networks in Radio Access Networks | Khaleda Papry, Francesco Spinnato, Marco Fiore, Mirco Nanni, Israat Haque | 2026-01-28 | Download | As 5G networks continue to evolve to deliver high speed, low latency, and reliable communications, ensuring uninterrupted service has become increasingly critical. |
| Immersive Volumetric Video Playback: Near-RT Resource Allocation and O-RAN-based Implementation | Yao Wen, Luping Xiang, Kun Yang | 2026-01-28 | Download | Immersive volumetric video streaming in extended reality (XR) demands ultra-low motion-to-photon (MTP) latency, which conventional edge-centric architectures struggle to meet due to per-frame computat... |
| Dependable Connectivity for Industrial Wireless Communication Networks | Nurul Huda Mahmood, Onel L. A. Lopez, David Ruiz-Guirola, Frank Burkhardt, Mehdi Rasti, Matti Latva-aho | 2026-01-28 | Download | Dependability - a system's ability to consistently provide reliable services by ensuring safety and maintainability in the face of internal or external disruptions - is a fundamental requirement for i... |
| IoT Device Identification with Machine Learning: Common Pitfalls and Best Practices | Kahraman Kostas, Rabia Yasa Kostas | 2026-01-28 | Download | This paper critically examines the device identification process using machine learning, addressing common pitfalls in existing literature. We analyze the trade-offs between identification methods (un... |
| Pocket RAG: On-Device RAG for First Aid Guidance in Offline Mobile Environment | Dong Ho Kang, Hyunjoon Lee, Hyeonjeong Cha, Minkyu Choi, Sungsoo Lim | 2026-01-28 | Download | In disaster scenarios or remote areas, first responders often lose network connectivity when providing first aid. In such situations, server-based AI systems fail to provide critical guidance. |
| Proactive SFC Provisioning with Forecast-Driven DRL in Data Centers | Parisa Fard Moshiri, Poonam Lohan, Burak Kantarci, Emil Janulewicz | 2026-01-28 | Download | Service Function Chaining (SFC) requires efficient placement of Virtual Network Functions (VNFs) to satisfy diverse service requirements while maintaining high resource utilization in Data Centers (DC... |

cs.OS - Operating Systems

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Meta-ROS: A Next-Generation Middleware Architecture for Adaptive and Scalable Robotic Systems | Anshul Ranjan, Anoosh Damodar, Neha Chougule, Dhruva S Nayak, Anantharaman P. N, Shylaja S S | 2026-01-28 | Download | The field of robotics faces significant challenges related to the complexity and interoperability of existing middleware frameworks, like ROS2, which can be difficult for new developers to adopt. |
| /dev/SDB: Software Defined Boot -- A novel standard for diskless booting anywhere and everywhere | Aditya Mitra, Hamza Haroon, Amaan Rais Shah, Mohammad Elham Rasooli, Bogdan Itsam Dorantes Nikolaev, Tuğçe Ballı | 2026-01-28 | Download | A computer is nothing but a device that processes the instructions supplied to it. However, as computers evolved, the instructions or codes started to be more complicated. |
| Rethinking Thread Scheduling under Oversubscription: A User-Space Framework for Coordinating Multi-runtime and Multi-process Workloads | Aleix Roca, Vicenç Beltran | 2026-01-28 | Download | The convergence of high-performance computing (HPC) and artificial intelligence (AI) is driving the emergence of increasingly complex parallel applications and workloads. |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| The Multiserver-Job Stochastic Recurrence Equation for Cloud Computing Performance Evaluation | Francois Baccelli, Diletta Olliaro, Marco Ajmone Marsan, Andrea Marin | 2026-01-28 | Download | We study the Multiserver-Job Queuing Model (MJQM) with general independent arrivals and service times under FCFS scheduling, using stochastic recurrence equations (SREs) and ergodic theory. |
| Colored Markov Modulated Fluid Queues | Benny Van Houdt | 2026-01-28 | Download | Markov-modulated fluid queues (MMFQs) are a powerful modeling framework for analyzing the performance of computer and communication systems. Their distinguishing feature is that the underlying Markov ... |
