2026-02-09
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| ALPHA-PIM: Analysis of Linear Algebraic Processing for High-Performance Graph Applications on a Real Processing-In-Memory System | Marzieh Barkhordar, Alireza Tabatabaeian, Mohammad Sadrosadati, Christina Giannoula, Juan Gomez Luna, Izzat El Hajj, Onur Mutlu, Alaa R. Alameldeen | 2026-02-09 | Download | Processing large-scale graph datasets is computationally intensive and time-consuming. Processor-centric CPU and GPU architectures, commonly used for graph applications, often face bottlenecks caused ... |
| karl. - A Research Vehicle for Automated and Connected Driving | Jean-Pierre Busch, Lukas Ostendorf, Guido Linden, Lennart Reiher, Till Beemelmanns, Bastian Lampe, Timo Woopen, Lutz Eckstein | 2026-02-09 | Download | As highly automated driving is transitioning from single-vehicle closed-access testing to commercial deployments of public ride-hailing in selected areas (e.g. ... |
| Antiferromagnetic Tunnel Junctions (AFMTJs) for In-Memory Computing: Modeling and Case Study | Yousuf Choudhary, Tosiron Adegbija | 2026-02-09 | Download | Antiferromagnetic Tunnel Junctions (AFMTJs) enable picosecond switching and femtojoule writes through ultrafast sublattice dynamics. We present the first end-to-end AFMTJ simulation framework integrat... |
| ZipFlow: a Compiler-based Framework to Unleash Compressed Data Movement for Modern GPUs | Gwangoo Yeo, Zhiyang Shen, Wei Cui, Matteo Interlandi, Rathijit Sen, Bailu Ding, Qi Chen, Minsoo Rhu | 2026-02-09 | Download | In GPU-accelerated data analytics, the overhead of data transfer from CPU to GPU becomes a performance bottleneck when the data scales beyond GPU memory capacity due to the limited PCIe bandwidth. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Harvest: Adaptive Photonic Switching Schedules for Collective Communication in Scale-up Domains | Mahir Rahman, Samuel Joseph, Nihar Kodkani, Behnaz Arzani, Vamsi Addanki | 2026-02-09 | Download | As chip-to-chip silicon photonics gain traction for their bandwidth and energy efficiency, their circuit-switched nature raises a fundamental question for collective communication: when and how should... |
| ALPHA-PIM: Analysis of Linear Algebraic Processing for High-Performance Graph Applications on a Real Processing-In-Memory System | Marzieh Barkhordar, Alireza Tabatabaeian, Mohammad Sadrosadati, Christina Giannoula, Juan Gomez Luna, Izzat El Hajj, Onur Mutlu, Alaa R. Alameldeen | 2026-02-09 | Download | Processing large-scale graph datasets is computationally intensive and time-consuming. Processor-centric CPU and GPU architectures, commonly used for graph applications, often face bottlenecks caused ... |
| Characterizing WebGPU Dispatch Overhead for LLM Inference Across Four GPU Vendors, Three Backends, and Three Browsers | Jędrzej Maczan | 2026-02-09 | Download | WebGPU's security-focused design imposes per-operation validation that compounds across the many small dispatches in neural network inference, yet the true cost of this overhead is poorly characterize... |
| Distributed Hybrid Parallelism for Large Language Models: Comparative Study and System Design Guide | Hossam Amer, Rezaul Karim, Ali Pourranjbar, Weiwei Zhang, Walid Ahmed, Boxing Chen | 2026-02-09 | Download | With the rapid growth of large language models (LLMs), a wide range of methods have been developed to distribute computation and memory across hardware devices for efficient training and inference. |
| DynamiQ: Accelerating Gradient Synchronization using Compressed Multi-hop All-reduce | Wenchen Han, Shay Vargaftik, Michael Mitzenmacher, Ran Ben Basat | 2026-02-09 | Download | Multi-hop all-reduce is the de facto backbone of large model training. As the training scale increases, the network often becomes a bottleneck, motivating reducing the volume of transmitted data. |
| Equilibria: Fair Multi-Tenant CXL Memory Tiering At Scale | Kaiyang Zhao, Neha Gholkar, Hasan Maruf, Abhishek Dhanotia, Johannes Weiner, Gregory Price, Ning Sun, Bhavya Dwivedi, Stuart Clark, Dimitrios Skarlatos | 2026-02-09 | Download | Memory dominates datacenter system cost and power. Memory expansion via Compute Express Link (CXL) is an effective way to provide additional memory at lower cost and power, but its effective use requi... |
| PARD: Enhancing Goodput for Inference Pipeline via Proactive Request Dropping | Zhixin Zhao, Yitao Hu, Simin Chen, Mingfang Ji, Wei Yang, Yuhao Zhang, Laiping Zhao, Wenxin Li, Xiulong Liu, Wenyu Qu, Hao Wang | 2026-02-09 | Download | Modern deep neural network (DNN) applications integrate multiple DNN models into inference pipelines with stringent latency requirements for customized tasks. |
| RIFLE: Robust Distillation-based FL for Deep Model Deployment on Resource-Constrained IoT Networks | Pouria Arefijamal, Mahdi Ahmadlou, Bardia Safaei, Jörg Henkel | 2026-02-09 | Download | Federated learning (FL) is a decentralized learning paradigm widely adopted in resource-constrained Internet of Things (IoT) environments. These devices, typically relying on TinyML models, collaborat... |
| Mathematical Foundations of Modeling ETL Process Chains | Levin Maier, Lucas Schulze, Robert Lilow, Lukas Hahn, Nikola Krasowski, Arnulf Barth, Sebastian Gaebel, Ferdi Güran, Oliver Hanau, Giovanni Wagner, Falk Borgmann, Oleg Arenz, Jan Peters | 2026-02-09 | Download | Extract-Transform-Load (ETL) processes are core components of modern data processing infrastructures. The throughput of processed data records can be adjusted by changing the amount of allocated resou... |
| Modalities, a PyTorch-native Framework For Large-scale LLM Training and Research | Max Lübbering, Timm Ruland, Richard Rutmann, Felix Stollenwerk, David Fitzek, Michael Fromm, Alexander Weber, Rafet Sifa, Nicolas Flores-Herr, Joachim Köhler, Mehdi Ali | 2026-02-09 | Download | Today's LLM (pre-) training and research workflows typically allocate a significant amount of compute to large-scale ablation studies. Despite the substantial compute costs of these ablations, existin... |
| Towards CXL Resilience to CPU Failures | Antonis Psistakis, Burak Ocalan, Chloe Alverti, Fabien Chaix, Ramnatthan Alagappan, Josep Torrellas | 2026-02-09 | Download | Compute Express Link (CXL) 3.0 and beyond allows the compute nodes of a cluster to share data with hardware cache coherence and at the granularity of a cache line. |
| HEAL: Online Incremental Recovery for Leaderless Distributed Systems Across Persistency Models | Antonis Psistakis, Burak Ocalan, Fabien Chaix, Ramnatthan Alagappan, Josep Torrellas | 2026-02-09 | Download | Ensuring resilience in distributed systems has become an acute concern. In today's environment, it is crucial to develop light-weight mechanisms that recover a distributed system from faults quickly a... |
| The Computer System Trail | Sushant Kumar Gupta | 2026-02-09 | Download | No matter how much the world of computing changes, system design remains crucial. While most people try to learn it through quick tutorials or AI-generated summaries, there is no better way to master ... |
| Fork, Explore, Commit: OS Primitives for Agentic Exploration | Cong Wang, Yusheng Zheng | 2026-02-09 | Download | AI agents increasingly perform agentic exploration: pursuing multiple solution paths in parallel and committing only the successful one. Because each exploration path may modify files and spawn proces... |
| ZipFlow: a Compiler-based Framework to Unleash Compressed Data Movement for Modern GPUs | Gwangoo Yeo, Zhiyang Shen, Wei Cui, Matteo Interlandi, Rathijit Sen, Bailu Ding, Qi Chen, Minsoo Rhu | 2026-02-09 | Download | In GPU-accelerated data analytics, the overhead of data transfer from CPU to GPU becomes a performance bottleneck when the data scales beyond GPU memory capacity due to the limited PCIe bandwidth. |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Harvest: Adaptive Photonic Switching Schedules for Collective Communication in Scale-up Domains | Mahir Rahman, Samuel Joseph, Nihar Kodkani, Behnaz Arzani, Vamsi Addanki | 2026-02-09 | Download | As chip-to-chip silicon photonics gain traction for their bandwidth and energy efficiency, their circuit-switched nature raises a fundamental question for collective communication: when and how should... |
| Probabilistic Fair Ordering of Events | Muhammad Haseeb, Jinkun Geng, Aurojit Panda, Radhika Mittal, Nirav Atre, Srinivas Narayana, Anirudh Sivaraman | 2026-02-09 | Download | A growing class of applications depends on fair ordering, where events that occur earlier should be processed before later ones. Providing such guarantees is difficult in practice because clock synchr... |
| Zero Trust for Multi-RAT IoT: Trust Boundary Management in Heterogeneous Wireless Network Environments | Jonathan Shelby | 2026-02-09 | Download | The proliferation of Multi-Radio Access Technology, Internet of Things devices, particularly Unmanned Aerial Vehicles operating across LoRaWAN, 5G/4G cellular, Meshtastic mesh, proprietary protocols s... |
| Lightweight Call Signaling and Peer-to-Peer Control of WebRTC Video Conferencing | Kundan Singh | 2026-02-09 | Download | We present the software architecture and implementation of our web-based multiparty video conference application. It does not use a media server. |
| DynamiQ: Accelerating Gradient Synchronization using Compressed Multi-hop All-reduce | Wenchen Han, Shay Vargaftik, Michael Mitzenmacher, Ran Ben Basat | 2026-02-09 | Download | Multi-hop all-reduce is the de facto backbone of large model training. As the training scale increases, the network often becomes a bottleneck, motivating reducing the volume of transmitted data. |
| Framework for Integrating Zero Trust in Cloud-Based Endpoint Security for Critical Infrastructure | Shyam Kumar Gajula | 2026-02-09 | Download | Cyber threats have become highly sophisticated, prompting a heightened concern for endpoint security, especially in critical infrastructure, to new heights. |
| Rethinking IPv6 Defense: A Unified Edge-Centric Zero-Trust Data-Plane Architecture | Walid Aljoby, Mohammed Alzayani, Md. Kamrul Hossain, Khaled A. Harras | 2026-02-09 | Download | IPv6 dependability is increasingly inseparable from IPv6 security: Neighbor Discovery (ND), Router Advertisements (RA), and ICMPv6 are essential for correct operation yet expose a broad attack surface... |
| 6G-Bench: An Open Benchmark for Semantic Communication and Network-Level Reasoning with Foundation Models in AI-Native 6G Networks | Mohamed Amine Ferrag, Abderrahmane Lakas, Merouane Debbah | 2026-02-09 | Download | This paper introduces 6G-Bench, an open benchmark for evaluating semantic communication and network-level reasoning in AI-native 6G networks. 6G-Bench defines a taxonomy of 30 decision-making tasks (T... |
| Equitable Multi-Task Learning for AI-RANs | Panayiotis Raptis, Fatih Aslan, George Iosifidis | 2026-02-09 | Download | AI-enabled Radio Access Networks (AI-RANs) are expected to serve heterogeneous users with time-varying learning tasks over shared edge resources. |
| From Raw Data to Shared 3D Semantics: Task-Oriented Communication for Multi-Robot Collaboration | Ruibo Xue, Jiedan Tan, Fang Liu, Jingwen Tong, Taotao Wang, Shuoyao Wang | 2026-02-09 | Download | Multi-robot systems (MRS) rely on exchanging raw sensory data to cooperate in complex three-dimensional (3D) environments. However, this strategy often leads to severe communication congestion and hig... |
| Decentralized Spatial Reuse Optimization in Wi-Fi: An Internal Regret Minimization Approach | Francesc Wilhelmi, Boris Bellalta, Miguel Casasnovas, Aleksandra Kijanka, Miguel Calvo-Fullana | 2026-02-09 | Download | Spatial Reuse (SR) is a cost-effective technique for improving spectral efficiency in dense IEEE 802.11 deployments by enabling simultaneous transmissions. |
| RIFLE: Robust Distillation-based FL for Deep Model Deployment on Resource-Constrained IoT Networks | Pouria Arefijamal, Mahdi Ahmadlou, Bardia Safaei, Jörg Henkel | 2026-02-09 | Download | Federated learning (FL) is a decentralized learning paradigm widely adopted in resource-constrained Internet of Things (IoT) environments. These devices, typically relying on TinyML models, collaborat... |
| PACC: Protocol-Aware Cross-Layer Compression for Compact Network Traffic Representation | Zhaochen Guo, Tianyufei Zhou, Honghao Wang, Ronghua Li, Shinan Liu | 2026-02-09 | Download | Network traffic classification is a core primitive for network security and management, yet it is increasingly challenged by pervasive encryption and evolving protocols. |
| MonkeyTree: Near-Minimal Congestion for Multi-tenant Training via Migration | Anton A. Zabreyko, Weiyang Wang, Manya Ghobadi | 2026-02-09 | Download | We present MonkeyTree, the first system to mitigate network congestion in multi-tenant GPU clusters through job-migration based defragmentation rather than network-layer techniques. |
| Software Testing at the Network Layer: Automated HTTP API Quality Assessment and Security Analysis of Production Web Applications | Ali Hassaan Mughal, Muhammad Bilal, Noor Fatima | 2026-02-09 | Download | Modern web applications rely heavily on client-side API calls to fetch data, render content, and communicate with backend services. However, the quality of these network interactions (redundant reques... |
| NeuroScaler: Towards Energy-Optimal Autoscaling for Container-Based Services | Alisson O. Chaves, Rodrigo Moreira, Larissa F. Rodrigues Moreira, Joao Correia, David Santos, Rui Silva, Tiago Barros, Daniel Corujo, Miguel Rocha, Flavio de Oliveira Silva | 2026-02-09 | Download | Future networks must meet stringent requirements while operating within tight energy and carbon constraints. Current autoscaling mechanisms remain workload-centric and infrastructure-siloed, and are l... |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Equilibria: Fair Multi-Tenant CXL Memory Tiering At Scale | Kaiyang Zhao, Neha Gholkar, Hasan Maruf, Abhishek Dhanotia, Johannes Weiner, Gregory Price, Ning Sun, Bhavya Dwivedi, Stuart Clark, Dimitrios Skarlatos | 2026-02-09 | Download | Memory dominates datacenter system cost and power. Memory expansion via Compute Express Link (CXL) is an effective way to provide additional memory at lower cost and power, but its effective use requi... |
| The Computer System Trail | Sushant Kumar Gupta | 2026-02-09 | Download | No matter how much the world of computing changes, system design remains crucial. While most people try to learn it through quick tutorials or AI-generated summaries, there is no better way to master ... |
| Fork, Explore, Commit: OS Primitives for Agentic Exploration | Cong Wang, Yusheng Zheng | 2026-02-09 | Download | AI agents increasingly perform agentic exploration: pursuing multiple solution paths in parallel and committing only the successful one. Because each exploration path may modify files and spawn proces... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Characterizing WebGPU Dispatch Overhead for LLM Inference Across Four GPU Vendors, Three Backends, and Three Browsers | Jędrzej Maczan | 2026-02-09 | Download | WebGPU's security-focused design imposes per-operation validation that compounds across the many small dispatches in neural network inference, yet the true cost of this overhead is poorly characterize... |
| A Machine Learning accelerated geophysical fluid solver | Yang Bai | 2026-02-09 | Download | Machine learning methods have been successful in many areas, like image classification and natural language processing. However, it still needs to be determined how to apply ML to areas with mathemati... |