2026-03-01
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| HAVEN: High-Bandwidth Flash Augmented Vector Engine for Large-Scale Approximate Nearest-Neighbor Search Acceleration | Po-Kai Hsu, Weihong Xu, Qunyou Liu, Tajana Rosing, Shimeng Yu | 2026-03-01 | Download | Retrieval-Augmented Generation (RAG) relies on large-scale Approximate Nearest Neighbor Search (ANNS) to retrieve semantically relevant context for large language models. |
| VIKIN: A Reconfigurable Accelerator for KANs and MLPs with Two-Stage Sparsity Support | Wenhui Ou, Zhuoyu Wu, Yipu Zhang, Zheng Wang, C. Patrick Yue | 2026-03-01 | Download | Recently, multi-layer perceptrons (MLPs) widely used in modern AI applications suffer from limited real-time performance due to intensive memory access overhead. |
| FLICKER: A Fine-Grained Contribution-Aware Accelerator for Real-Time 3D Gaussian Splatting | Wenhui Ou, Zhuoyu Wu, Yipu Zhang, Dongjun Wu, Freddy Ziyang Hong, Chik Patrick Yue | 2026-03-01 | Download | Recently, 3D Gaussian Splatting (3DGS) has emerged as a mainstream rendering technique due to its photorealistic quality and low latency. However, processing massive numbers of non-contributing Gaussi... |
| SHIELD8-UAV: Sequential 8-bit Hardware Implementation of a Precision-Aware 1D-F-CNN for Low-Energy UAV Acoustic Detection and Temporal Tracking | Susmita Ghanta, Karan Nathwani, Rohit Chaurasiya | 2026-03-01 | Download | Real-time unmanned aerial vehicle (UAV) acoustic detection at the edge demands low-latency inference under strict power and hardware limits. This paper presents SHIELD8-UAV, a sequential 8-bit hardwar... |
| TriMoE: Augmenting GPU with AMX-Enabled CPU and DIMM-NDP for High-Throughput MoE Inference via Offloading | Yudong Pan, Yintao He, Tianhua Han, Lian Liu, Shixin Zhao, Zhirong Chen, Mengdi Wang, Cangyuan Li, Yinhe Han, Ying Wang | 2026-03-01 | Download | To deploy large Mixture-of-Experts (MoE) models cost-effectively, offloading-based single-GPU heterogeneous inference is crucial. While GPU-CPU architectures that offload cold experts are constrained ... |
| SoberDSE: Sample-Efficient Design Space Exploration via Learning-Based Algorithm Selection | Lei Xu, Shanshan Wang, Chenglong Xiao | 2026-03-01 | Download | High-Level Synthesis (HLS) is a pivotal electronic design automation (EDA) technology that enables the generation of hardware circuits from high-level language descriptions. |
| Accelerating Multi-Scale Deformable Attention Using Near-Memory-Processing Architecture | Huize Li, Qinggang Wang, Bing Gao, Dan Chen, Yu Huang, Xin Xin | 2026-03-01 | Download | Multi Scale Deformable Attention (MSDAttn) has become a fundamental component in various vision tasks due to its effective multi scale grid sampling (MSGS). |
| Capstone: Power-Capped Pipelining for Coarse-Grained Reconfigurable Array Compilers | Sabrina Yarzada, Christopher Torng | 2026-03-01 | Download | Coarse-grained reconfigurable arrays (CGRAs) have attracted growing interest because they exhibit performance and energy efficiency competitive with ASICs while maintaining flexibility similar to FPGA... |
| Characterizing VLA Models: Identifying the Action Generation Bottleneck for Edge AI Architectures | Manoj Vishwanathan, Suvinay Subramanian, Anand Raghunathan | 2026-03-01 | Download | Vision-Language-Action (VLA) models are an emerging class of workloads critical for robotics and embodied AI at the edge. As these models scale, they demonstrate significant capability gains, yet they... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| The Finality Calculator: Analyzing and Quantifying Filecoin's Finality Guarantees | Guy Goren, Jorge M. Soares | 2026-03-01 | Download | In this paper, we analyze the finality of the Filecoin network, focusing on dynamic probabilistic guarantees of tipset permanence in the canonical chain. |
| A402: Binding Cryptocurrency Payments to Service Execution for Agentic Commerce | Yue Li, Lei Wang, Kaixuan Wang, Zhiqiang Yang, Ke Wang, Zhi Guan, Jianbo Gao | 2026-03-01 | Download | The rapid proliferation of autonomous AI agents is driving a shift toward agentic commerce, where agents are expected to autonomously invoke and pay for services. |
| Zipage: Maintain High Request Concurrency for LLM Reasoning through Compressed PagedAttention | Mengqi Liao, Lu Wang, Chaoyun Zhang, Bo Qiao, Si Qin, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Huaiyu Wan | 2026-03-01 | Download | With reasoning becoming the generative paradigm for large language models (LLMs), the memory bottleneck caused by KV cache during the decoding phase has become a critical factor limiting high-concurre... |
| TriMoE: Augmenting GPU with AMX-Enabled CPU and DIMM-NDP for High-Throughput MoE Inference via Offloading | Yudong Pan, Yintao He, Tianhua Han, Lian Liu, Shixin Zhao, Zhirong Chen, Mengdi Wang, Cangyuan Li, Yinhe Han, Ying Wang | 2026-03-01 | Download | To deploy large Mixture-of-Experts (MoE) models cost-effectively, offloading-based single-GPU heterogeneous inference is crucial. While GPU-CPU architectures that offload cold experts are constrained ... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| A Border Gateway Protocol Extension for Distributing Endpoint Identifier Reachability Information in Delay-tolerant Networks | Marius Feldmann, Théo Tchilinguirian, Felix Walter | 2026-03-01 | Download | The Delay-Tolerant Networking (DTN) community has created solid results during the last three decades. One aspect that still requires focus, however, is the simplification of configuring systems parti... |
| On Utility-optimal Entanglement Routing in Quantum Networks | Sounak Kar, Arpan Mukhopadhyay | 2026-03-01 | Download | Quantum networks are envisioned to enable reliable distribution and manipulation of quantum information across distances, forming the foundation of a future quantum internet. |
| Demand- and Priority-Aware Adaptive Congestion Control for Heterogeneous V2X Service Requirements | Miguel Sepulcre, Javier Tortosa-Garcia, Javier Gozalvez | 2026-03-01 | Download | Vehicle-to-Everything (V2X) communications enable the exchange of information among vehicles to improve road safety and traffic efficiency. As V2X deployments progress, vehicles are expected to suppor... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Characterizing VLA Models: Identifying the Action Generation Bottleneck for Edge AI Architectures | Manoj Vishwanathan, Suvinay Subramanian, Anand Raghunathan | 2026-03-01 | Download | Vision-Language-Action (VLA) models are an emerging class of workloads critical for robotics and embodied AI at the edge. As these models scale, they demonstrate significant capability gains, yet they... |