2026-01-28
cs.AR - Architecture
| Title | Authors | Published | Abstract |
|---|---|---|---|
| REASON: Accelerating Probabilistic Logical Reasoning for Scalable Neuro-Symbolic Intelligence | Zishen Wan, Che-Kai Liu, Jiayi Qian, Hanchen Yang, Arijit Raychowdhury, Tushar Krishna | 2026-01-28 | Neuro-symbolic AI systems integrate neural perception with symbolic reasoning to enable data-efficient, interpretable, and robust intelligence beyond purely neural models. |
| Beyond GEMM-Centric NPUs: Enabling Efficient Diffusion LLM Sampling | Binglei Lou, Haoran Wu, Yao Lai, Jiayi Nie, Can Xiao, Xuan Guo, Rika Antonova, Robert Mullins, Aaron Zhao | 2026-01-28 | Diffusion Large Language Models (dLLMs) introduce iterative denoising to enable parallel token generation, but their sampling phase displays fundamentally different characteristics compared to GEMM-ce... |
| GRTX: Efficient Ray Tracing for 3D Gaussian-Based Rendering | Junseo Lee, Sangyun Jeon, Jungi Lee, Junyong Park, Jaewoong Sim | 2026-01-28 | 3D Gaussian Splatting has gained widespread adoption across diverse applications due to its exceptional rendering performance and visual quality. |
| VersaQ-3D: A Reconfigurable Accelerator Enabling Feed-Forward and Generalizable 3D Reconstruction via Versatile Quantization | Yipu Zhang, Jintao Cheng, Xingyu Liu, Zeyu Li, Carol Jingyi Li, Jin Wu, Lin Jiang, Yuan Xie, Jiang Xu, Wei Zhang | 2026-01-28 | The Visual Geometry Grounded Transformer (VGGT) enables strong feed-forward 3D reconstruction without per-scene optimization. However, its billion-parameter scale creates high memory and compute deman... |
| SATA: Sparsity-Aware Scheduling for Selective Token Attention | Zhenkun Fan, Zishen Wan, Che-Kai Liu, Ashwin Sanjay Lele, Win-San Khwa, Bo Zhang, Meng-Fan Chang, Arijit Raychowdhury | 2026-01-28 | Transformers have become the foundation of numerous state-of-the-art AI models across diverse domains, thanks to their powerful attention mechanism for modeling long-range dependencies. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Abstract |
|---|---|---|---|
| Deep Reinforcement Learning for Fault-Adaptive Routing in Eisenstein-Jacobi Interconnection Topologies | Mohammad Walid Charrwi, Zaid Hussain | 2026-01-28 | The increasing density of many-core architectures necessitates interconnection networks that are both high-performance and fault-resilient. Eisenstein-Jacobi (EJ) networks, with their symmetric 6-regu... |
| Agentic Fog: A Policy-driven Framework for Distributed Intelligence in Fog Computing | Saeed Akbar, Muhammad Waqas, Rahmat Ullah | 2026-01-28 | Fog and edge computing require adaptive control schemes that can handle partial observability, severe latency requirements, and dynamically changing workloads. |
| SA-PEF: Step-Ahead Partial Error Feedback for Efficient Federated Learning | Dawit Kiros Redie, Reza Arablouei, Stefan Werner | 2026-01-28 | Biased gradient compression with error feedback (EF) reduces communication in federated learning (FL), but under non-IID data, the residual error can decay slowly, causing gradient mismatch and stalle... |
| Beyond GEMM-Centric NPUs: Enabling Efficient Diffusion LLM Sampling | Binglei Lou, Haoran Wu, Yao Lai, Jiayi Nie, Can Xiao, Xuan Guo, Rika Antonova, Robert Mullins, Aaron Zhao | 2026-01-28 | Diffusion Large Language Models (dLLMs) introduce iterative denoising to enable parallel token generation, but their sampling phase displays fundamentally different characteristics compared to GEMM-ce... |
| OnePiece: A Large-Scale Distributed Inference System with RDMA for Complex AI-Generated Content (AIGC) Workflows | June Chen, Neal Xu, Gragas Huang, Bok Zhou, Stephen Liu | 2026-01-28 | The rapid growth of AI-generated content (AIGC) has enabled high-quality creative production across diverse domains, yet existing systems face critical inefficiencies in throughput, resource utilizati... |
| Syncopate: Efficient Multi-GPU AI Kernels via Automatic Chunk-Centric Compute-Communication Overlap | Xinwei Qiang, Yue Guan, Zhengding Hu, Keren Zhou, Yufei Ding, Adnan Aziz | 2026-01-28 | Communication has become a first-order bottleneck in large-scale GPU workloads, and existing distributed compilers address it mainly by overlapping whole compute and communication kernels at the stream... |
| Rethinking Thread Scheduling under Oversubscription: A User-Space Framework for Coordinating Multi-runtime and Multi-process Workloads | Aleix Roca, Vicenç Beltran | 2026-01-28 | The convergence of high-performance computing (HPC) and artificial intelligence (AI) is driving the emergence of increasingly complex parallel applications and workloads. |
| Meeting SLOs, Slashing Hours: Automated Enterprise LLM Optimization with OptiKIT | Nicholas Santavas, Kareem Eissa, Patrycja Cieplicka, Piotr Florek, Matteo Nulli, Stefan Vasilev, Seyyed Hadi Hashemi, Antonios Gasteratos, Shahram Khadivi | 2026-01-28 | Enterprise LLM deployment faces a critical scalability challenge: organizations must optimize models systematically to scale AI initiatives within constrained compute budgets, yet the specialized expe... |
| Graph-Structured Deep Learning Framework for Multi-task Contention Identification with High-dimensional Metrics | Xiao Yang, Yinan Ni, Yuqi Tang, Zhimin Qiu, Chen Wang, Tingzhou Yuan | 2026-01-28 | This study addresses the challenge of accurately identifying multi-task contention types in high-dimensional system environments and proposes a unified contention classification framework that integra... |
| LIFT: Byzantine Resilient Hub-Sampling | Mohamed Amine Legheraba, Nour Rachdi, Maria Gradinariu Potop-Butucaru, Sébastien Tixeuil | 2026-01-28 | Recently, a novel peer sampling protocol, Elevator, was introduced to construct network topologies tailored for emerging decentralized applications such as federated learning and blockchain. |
| SuperInfer: SLO-Aware Rotary Scheduling and Memory Management for LLM Inference on Superchips | Jiahuan Yu, Mingtao Hu, Zichao Lin, Minjia Zhang | 2026-01-28 | Large Language Model (LLM) serving faces a fundamental tension between stringent latency Service Level Objectives (SLOs) and limited GPU memory capacity. |
| StreamFusion: Scalable Sequence Parallelism for Distributed Inference of Diffusion Transformers on GPUs | Jiacheng Yang, Jun Wu, Yaoyao Ding, Zhiying Xu, Yida Wang, Gennady Pekhimenko | 2026-01-28 | Diffusion Transformers (DiTs) have gained increasing adoption in high-quality image and video generation. As demand for higher-resolution images and longer videos increases, single-GPU inference becom... |
| Securing AI Agents in Cyber-Physical Systems: A Survey of Environmental Interactions, Deepfake Threats, and Defenses | Mohsen Hatami, Van Tuan Pham, Hozefa Lakadawala, Yu Chen | 2026-01-28 | The increasing integration of AI agents into cyber-physical systems (CPS) introduces new security risks that extend beyond traditional cyber or physical threat models. |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Abstract |
|---|---|---|---|
| An Explainable Failure Prediction Framework for Neural Networks in Radio Access Networks | Khaleda Papry, Francesco Spinnato, Marco Fiore, Mirco Nanni, Israat Haque | 2026-01-28 | As 5G networks continue to evolve to deliver high speed, low latency, and reliable communications, ensuring uninterrupted service has become increasingly critical. |
| Immersive Volumetric Video Playback: Near-RT Resource Allocation and O-RAN-based Implementation | Yao Wen, Luping Xiang, Kun Yang | 2026-01-28 | Immersive volumetric video streaming in extended reality (XR) demands ultra-low motion-to-photon (MTP) latency, which conventional edge-centric architectures struggle to meet due to per-frame computat... |
| Dependable Connectivity for Industrial Wireless Communication Networks | Nurul Huda Mahmood, Onel L. A. Lopez, David Ruiz-Guirola, Frank Burkhardt, Mehdi Rasti, Matti Latva-aho | 2026-01-28 | Dependability - a system's ability to consistently provide reliable services by ensuring safety and maintainability in the face of internal or external disruptions - is a fundamental requirement for i... |
| IoT Device Identification with Machine Learning: Common Pitfalls and Best Practices | Kahraman Kostas, Rabia Yasa Kostas | 2026-01-28 | This paper critically examines the device identification process using machine learning, addressing common pitfalls in existing literature. We analyze the trade-offs between identification methods (un... |
| Pocket RAG: On-Device RAG for First Aid Guidance in Offline Mobile Environment | Dong Ho Kang, Hyunjoon Lee, Hyeonjeong Cha, Minkyu Choi, Sungsoo Lim | 2026-01-28 | In disaster scenarios or remote areas, first responders often lose network connectivity when providing first aid. In such situations, server-based AI systems fail to provide critical guidance. |
| Proactive SFC Provisioning with Forecast-Driven DRL in Data Centers | Parisa Fard Moshiri, Poonam Lohan, Burak Kantarci, Emil Janulewicz | 2026-01-28 | Service Function Chaining (SFC) requires efficient placement of Virtual Network Functions (VNFs) to satisfy diverse service requirements while maintaining high resource utilization in Data Centers (DC... |
cs.OS - Operating Systems
| Title | Authors | Published | Abstract |
|---|---|---|---|
| Meta-ROS: A Next-Generation Middleware Architecture for Adaptive and Scalable Robotic Systems | Anshul Ranjan, Anoosh Damodar, Neha Chougule, Dhruva S Nayak, Anantharaman P. N, Shylaja S S | 2026-01-28 | The field of robotics faces significant challenges related to the complexity and interoperability of existing middleware frameworks, like ROS2, which can be difficult for new developers to adopt. |
| /dev/SDB: Software Defined Boot -- A novel standard for diskless booting anywhere and everywhere | Aditya Mitra, Hamza Haroon, Amaan Rais Shah, Mohammad Elham Rasooli, Bogdan Itsam Dorantes Nikolaev, Tuğçe Ballı | 2026-01-28 | A computer is nothing but a device that processes the instructions supplied to it. However, as computers evolved, the instructions or codes started to be more complicated. |
| Rethinking Thread Scheduling under Oversubscription: A User-Space Framework for Coordinating Multi-runtime and Multi-process Workloads | Aleix Roca, Vicenç Beltran | 2026-01-28 | The convergence of high-performance computing (HPC) and artificial intelligence (AI) is driving the emergence of increasingly complex parallel applications and workloads. |
cs.PF - Performance
| Title | Authors | Published | Abstract |
|---|---|---|---|
| The Multiserver-Job Stochastic Recurrence Equation for Cloud Computing Performance Evaluation | Francois Baccelli, Diletta Olliaro, Marco Ajmone Marsan, Andrea Marin | 2026-01-28 | We study the Multiserver-Job Queuing Model (MJQM) with general independent arrivals and service times under FCFS scheduling, using stochastic recurrence equations (SREs) and ergodic theory. |
| Colored Markov Modulated Fluid Queues | Benny Van Houdt | 2026-01-28 | Markov-modulated fluid queues (MMFQs) are a powerful modeling framework for analyzing the performance of computer and communication systems. Their distinguishing feature is that the underlying Markov ... |