2026-02-10
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Execution-Centric Characterization of FP8 Matrix Cores, Asynchronous Execution, and Structured Sparsity on AMD MI300A | Aaron Jarmusch, Connor Vitz, Sunita Chandrasekaran | 2026-02-10 | Download | The AMD MI300A APU integrates CDNA3 GPUs with high-bandwidth memory and advanced accelerator features: FP8 matrix cores, asynchronous compute engines (ACE), and 2:4 structured sparsity. |
| Area-Efficient In-Memory Computing for Mixture-of-Experts via Multiplexing and Caching | Hanyuan Gao, Xiaoxuan Yang | 2026-02-10 | Download | Mixture-of-Experts (MoE) layers activate a subset of model weights, dubbed experts, to improve model performance. MoE is particularly promising for deployment on process-in-memory (PIM) architectures,... |
| ACE-RTL: When Agentic Context Evolution Meets RTL-Specialized LLMs | Chenhui Deng, Zhongzhi Yu, Guan-Ting Liu, Nathaniel Pinckney, Brucek Khailany, Haoxing Ren | 2026-02-10 | Download | Recent advances in LLMs have sparked growing interest in applying them to hardware design automation, particularly for accurate RTL code generation. |
| KernelCraft: Benchmarking for Agentic Close-to-Metal Kernel Generation on Emerging Hardware | Jiayi Nie, Haoran Wu, Yao Lai, Zeyu Cao, Cheng Zhang, Binglei Lou, Erwei Wang, Jianyi Cheng, Timothy M. Jones, Robert Mullins, Rika Antonova, Yiren Zhao | 2026-02-10 | Download | New AI accelerators with novel instruction set architectures (ISAs) often require developers to manually craft low-level kernels -- a time-consuming, laborious, and error-prone process that cannot sca... |
| Development of an Energy-Efficient and Real-Time Data Movement Strategy for Next-Generation Heterogeneous Mixed-Criticality Systems | Thomas Benz | 2026-02-10 | Download | Industrial domains such as automotive, robotics, and aerospace are rapidly evolving to satisfy the increasing demand for machine-learning-driven Autonomy, Connectivity, Electrification, and Shared mob... |
| AnalogToBi: Device-Level Analog Circuit Topology Generation via Bipartite Graph and Grammar Guided Decoding | Seungmin Kim, Mingun Kim, Yuna Lee, Yulhwa Kim | 2026-02-10 | Download | Automatic generation of device-level analog circuit topologies remains a fundamental challenge in analog design automation. Recent transformer-based approaches have shown promise, yet they often suffe... |
| SiliconMind-V1: Multi-Agent Distillation and Debug-Reasoning Workflows for Verilog Code Generation | Mu-Chi Chen, Yu-Hung Kao, Po-Hsuan Huang, Shao-Chun Ho, Hsiang-Yu Tsou, I-Ting Wu, En-Ming Huang, Yu-Kai Hung, Wei-Po Hsin, Cheng Liang, Chia-Heng Tu, Shih-Hao Hung, H. T. Kung | 2026-02-10 | Download | Large language models (LLMs) have recently emerged as a promising approach for automating Verilog code generation; however, existing methods primarily emphasize syntactic correctness and often rely on... |
| Accelerating Post-Quantum Cryptography via LLM-Driven Hardware-Software Co-Design | Yuchao Liao, Tosiron Adegbija, Roman Lysecky | 2026-02-10 | Download | Post-quantum cryptography (PQC) is crucial for securing data against emerging quantum threats. However, its algorithms are computationally complex and difficult to implement efficiently on hardware. |
| FHECore: Rethinking GPU Microarchitecture for Fully Homomorphic Encryption | Lohit Daksha, Seyda Guzelhan, Kaustubh Shivdikar, Carlos Agulló Domingo, Óscar Vera Lopez, Gilbert Jonatan, Hubert Dymarkowski, Aymane El Jerari, José Cano, José L. Abellán, John Kim, David Kaeli, Ajay Joshi | 2026-02-10 | Download | Fully Homomorphic Encryption (FHE) enables computation directly on encrypted data but incurs massive computational and memory overheads, often exceeding plaintext execution by several orders of magnit... |
| CktEvo: Repository-Level RTL Code Benchmark for Design Evolution | Zhengyuan Shi, Jingxin Wang, Tairan Cheng, Changran Xu, Weikang Qian, Qiang Xu | 2026-02-10 | Download | Register-Transfer Level (RTL) coding is an iterative, repository-scale process in which Power, Performance, and Area (PPA) emerge from interactions across many files and the downstream toolchain. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Flash-SD-KDE: Accelerating SD-KDE with Tensor Cores | Elliot L. Epstein, Rajat Vadiraj Dwaraknath, John Winnicki | 2026-02-10 | Download | Score-debiased kernel density estimation (SD-KDE) achieves improved asymptotic convergence rates over classical KDE, but its use of an empirical score has made it significantly slower in practice. |
| Implementability of Global Distributed Protocols modulo Network Architectures | Elaine Li, Thomas Wies | 2026-02-10 | Download | Global protocols specify distributed, message-passing protocols from a birds-eye view, and are used as a specification for synthesizing local implementations. |
| Execution-Centric Characterization of FP8 Matrix Cores, Asynchronous Execution, and Structured Sparsity on AMD MI300A | Aaron Jarmusch, Connor Vitz, Sunita Chandrasekaran | 2026-02-10 | Download | The AMD MI300A APU integrates CDNA3 GPUs with high-bandwidth memory and advanced accelerator features: FP8 matrix cores, asynchronous compute engines (ACE), and 2:4 structured sparsity. |
| KORAL: Knowledge Graph Guided LLM Reasoning for SSD Operational Analysis | Mayur Akewar, Sandeep Madireddy, Dongsheng Luo, Janki Bhimani | 2026-02-10 | Download | Solid State Drives (SSDs) are critical to datacenters, consumer platforms, and mission-critical systems. Yet diagnosing their performance and reliability is difficult because data are fragmented and t... |
| Why Do AI Agents Systematically Fail at Cloud Root Cause Analysis? | Taeyoon Kim, Woohyeok Park, Hoyeong Yun, Kyungyong Lee | 2026-02-10 | Download | Failures in large-scale cloud systems incur substantial financial losses, making automated Root Cause Analysis (RCA) essential for operational stability. |
| Efficient Remote Prefix Fetching with GPU-native Media ASICs | Liang Mi, Weijun Wang, Jinghan Chen, Ting Cao, Haipeng Dai, Yunxin Liu | 2026-02-10 | Download | Remote KV cache reuse fetches KV cache for identical contexts from remote storage, avoiding recomputation, accelerating LLM inference. While it excels in high-speed networks, its performance degrades ... |
| Revealing the Challenges of Attention-FFN Disaggregation for Modern MoE Models and Hardware Systems | Guowei Liu, Hongming Li, Yaning Guo, Yongxi Lyu, Mo Zhou, Yi Liu, Zhaogeng Li, Yanpeng Wang | 2026-02-10 | Download | Deploying large-scale MoE models presents challenges in memory capacity and bandwidth for expert activation. While Attention-FFN Disaggregation (AFD) has emerged as a potential architecture to decoupl... |
| High-performance Vector-length Agnostic Quantum Circuit Simulations on ARM Processors | Ruimin Shi, Gabin Schieffer, Pei-Hung Lin, Maya Gokhale, Andreas Herten, Ivy Peng | 2026-02-10 | Download | ARM SVE and RISC-V RVV are emerging vector architectures in high-end processors that support vectorization of flexible vector length. In this work, we leverage an important workload for quantum comput... |
| Rashomon Sets and Model Multiplicity in Federated Learning | Xenia Heilmann, Luca Corbucci, Mattia Cerrato | 2026-02-10 | Download | The Rashomon set captures the collection of models that achieve near-identical empirical performance yet may differ substantially in their decision boundaries. |
| It's not a lie if you don't get caught: simplifying reconfiguration in SMR through dirty logs | Allen Clement, Natacha Crooks, Neil Giridharan, Alex Shamis | 2026-02-10 | Download | Production state-machine replication (SMR) implementations are complex, multi-layered architectures comprising data dissemination, ordering, execution, and reconfiguration components. |
| The Coordination Criterion | Joseph M. Hellerstein | 2026-02-10 | Download | When is coordination intrinsically required by a distributed specification, rather than imposed by a particular protocol or implementation strategy? We give a general answer using minimal assumptions. |
| Architectural Foundations for Checkpointing and Restoration in Quantum HPC Systems | Qiang Guan, Qinglei Cao, Xiaoyi Lu, Siyuan Niu | 2026-02-10 | Download | In this work, we explore the design of the checkpointing and restoration for quantum HPC that leverages dynamic circuit technology to enable restartable and resilient quantum execution. |
| LLM-CoOpt: A Co-Design and Optimization Framework for Efficient LLM Inference on Heterogeneous Platforms | Jie Kong, Wei Wang, Jiehan Zhou, Chen Yu | 2026-02-10 | Download | Major challenges in LLMs inference remain frequent memory bandwidth bottlenecks, computational redundancy, and inefficiencies in long-sequence processing. |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Bring Your Own Objective: Inter-operability of Network Objectives in Datacenters | Sanjoli Narang, Anup Agarwal, Venkat Arun, Manya Ghobadi | 2026-02-10 | Download | Datacenter networks are currently locked in a "tyranny of the single objective". While modern workloads demand diverse performance goals, ranging from coflow completion times, per-flow fairness, short... |
| Resilient Topology-Aware Coordination for Dynamic 3D UAV Networks under Node Failure | Chuan-Chi Lai | 2026-02-10 | Download | Ensuring continuous service coverage under unexpected hardware failures is a fundamental challenge for 3D Aerial-Ground Integrated Networks. Although Multi-Agent Reinforcement Learning facilitates aut... |
| ORCHID: Fairness-Aware Orchestration in Mission-Critical Air-Ground Integrated Networks | Chuan-Chi Lai, Chi Jai Choy | 2026-02-10 | Download | In the era of 6G Air-Ground Integrated Networks (AGINs), Unmanned Aerial Vehicles (UAVs) are pivotal for providing on-demand wireless coverage in mission-critical environments, such as post-disaster r... |
| SCOPE: A Training-Free Online 3D Deployment for UAV-BSs with Theoretical Analysis and Comparative Study | Chuan-Chi Lai | 2026-02-10 | Download | Unmanned Aerial Vehicle (UAV)-mounted Base Stations (UAV-BSs) offer a flexible solution for serving ground users in temporary hotspot scenarios. |
| Tracing Data Packet Paths over the Internet using Traceroute | Thomas Dreibholz, Somnath Mazumdar | 2026-02-10 | Download | Network communication using the Internet Protocol (IP) is a pillar of modern Internet applications. IP allows data packets to travel the world through a complex set of interconnected computer networks... |
| Hybrid Responsible AI-Stochastic Approach for SLA Compliance in Multivendor 6G Networks | Emanuel Figetakis, Ahmed Refaey Hussein | 2026-02-10 | Download | The convergence of AI and 6G network automation introduces new challenges in maintaining transparency, fairness, and accountability across multivendor management systems. |
| 6G NTN Waveforms: A Comparison of OTFS, AFDM and OCDM in LEO Satellite Channels | Baidyanath Mandal, Aniruddha Chandra, Rastislav Roka, Jarosław Wojtun, Jan Kelner, Cezary Ziołkowski | 2026-02-10 | Download | Sixth generation (6G) physical layer (PHY) is evolving beyond the legacy orthogonal frequency division multiplexing (OFDM)-based waveforms. In this paper, we compare the bit error rate (BER) performan... |
| Optimally Deployed Multistatic OTFS-ISAC Design With Kalman-Based Tracking of Targets | Jyotsna Rani, Kuntal Deka, Ganesh Prasad, Zilong Liu | 2026-02-10 | Download | This paper studies orthogonal time-frequency space (OTFS) modulation aided multistatic integrated sensing and communication (ISAC) in vehicular networks, whereby its delay-Doppler robustness is exploi... |
| Semantic Waveforms for AI-Native 6G Networks | Nour Hello, Mohamed Amine Hamoura, Francois Rivet, Emilio Calvanese Strinati | 2026-02-10 | Download | In this paper, we propose a semantic-aware waveform design framework for AI-native 6G networks that jointly optimizes physical layer resource usage and semantic communication efficiency and robustness... |
| ISO FastLane: Faster ISO 11783 with Dual Stack Approach as a Short Term Solution | Timo Oksanen | 2026-02-10 | Download | The agricultural industry has been searching for a high-speed successor to the 250 kbit/s CAN bus backbone of ISO 11783 (ISOBUS) for over a decade, yet no protocol-level solution has reached standardi... |
| Fidelity-Age-Aware Scheduling in Quantum Repeater Networks | Ozgur Ercetin, Zafer Gedik | 2026-02-10 | Download | Quantum repeater networks distribute entanglement over long distances but must balance fidelity, delay, and resource contention. Prior work optimized throughput and end-to-end fidelity, yet little att... |
| QoS Identifier and Slice Mapping in 5G and Non-Terrestrial Network Interconnected Systems | Yuma Abe, Mariko Sekiguchi, Amane Miura | 2026-02-10 | Download | The interconnection of 5G and non-terrestrial networks (NTNs) has been actively studied to expand connectivity beyond conventional terrestrial infrastructure. |
| XLB: A High Performance Layer-7 Load Balancer for Microservices using eBPF-based In-kernel Interposition | Yuejie Wang, Chenchen Shou, Jiaxu Qian, Guyue Liu | 2026-02-10 | Download | L7 load balancers are a fundamental building block in microservices as they enable fine-grained traffic distribution. Compared to monolithic applications, microservices demand higher performance and s... |
| Resilient and Freshness-Aware Scheduling for Industrial Multi-Hop IAB Networks: A Packet Duplication Approach | Shuo Zhu, Siyu Lin, Zijing Wang, Qiao Ren, Xiaoheng Deng, Bo Ai | 2026-02-10 | Download | In industrial millimeter-wave (mmWave) multi-hop Integrated Access and Backhaul (IAB) networks, dynamic blockages caused by moving obstacles pose a severe threat to robust and continuous networks. |
| MalMoE: Mixture-of-Experts Enhanced Encrypted Malicious Traffic Detection Under Graph Drift | Yunpeng Tan, Qingyang Li, Mingxin Yang, Yannan Hu, Lei Zhang, Xinggong Zhang | 2026-02-10 | Download | Encryption has been commonly used in network traffic to secure transmission, but it also brings challenges for malicious traffic detection, due to the invisibility of the packet payload. |
| XMap: Fast Internet-wide IPv4 and IPv6 Network Scanner | Xiang Li, Zixuan Xie, Lu Sun, Yuqi Qiu, Zuyao Xu, Zheli Liu | 2026-02-10 | Download | XMap is an open-source network scanner designed for performing fast Internet-wide IPv4 and IPv6 network research scanning. XMap was initially developed as the research artifact of a paper published at... |
| Cooperative Edge Caching with Large Language Model in Wireless Networks | Ning Yang, Wentao Wang, Lingtao Ouyang, Haijun Zhang | 2026-02-10 | Download | Cooperative edge caching in overlapping zones couples Base Station (BS) decisions, making content replacement sensitive to spatial topology and temporal reuse. |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| AgentCgroup: Understanding and Controlling OS Resources of AI Agents | Yusheng Zheng, Jiakun Fan, Quanzhi Fu, Yiwei Yang, Wei Zhang, Andi Quinn | 2026-02-10 | Download | AI agents are increasingly deployed in multi-tenant cloud environments, where they execute diverse tool calls within sandboxed containers, each call with distinct resource demands and rapid fluctuatio... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| XLB: A High Performance Layer-7 Load Balancer for Microservices using eBPF-based In-kernel Interposition | Yuejie Wang, Chenchen Shou, Jiaxu Qian, Guyue Liu | 2026-02-10 | Download | L7 load balancers are a fundamental building block in microservices as they enable fine-grained traffic distribution. Compared to monolithic applications, microservices demand higher performance and s... |