
2026-03-10

cs.AR - Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Unifying Logical and Physical Layout Representations via Heterogeneous Graphs for Circuit Congestion Prediction | Runbang Hu, Bo Fang, Bingzhe Li, Yuede Ji | 2026-03-10 | Download | As Very Large Scale Integration (VLSI) designs continue to scale in size and complexity, layout verification has become a central challenge in modern Electronic Design Automation (EDA) workflows. |
| Hardware Efficient Approximate Convolution with Tunable Error Tolerance for CNNs | Vishal Shashidhar, Anupam Kumari, Roy P Paily | 2026-03-10 | Download | Modern CNNs' high computational demands hinder edge deployment, as traditional "hard" sparsity (skipping mathematical zeros) loses effectiveness in deep layers or with smooth activations like Tanh. |
| Pooling Engram Conditional Memory in Large Language Models using CXL | Ruiyang Ma, Teng Ma, Zhiyuan Su, Hantian Zha, Xinpeng Zhao, Xuchun Shang, Xingrui Yi, Zheng Liu, Zhu Cao, An Wu, Zhichong Dou, Ziqian Liu, Daikang Kuang, Guojie Luo | 2026-03-10 | Download | Engram conditional memory has emerged as a promising component for LLMs by decoupling static knowledge lookup from dynamic computation. Since Engram exhibits sparse access patterns and supports prefet... |
| Nemo: A Low-Write-Amplification Cache for Tiny Objects on Log-Structured Flash Devices | Xufeng Yang, Tingting Tan, Jingxin Hu, Congming Gao, Mingyang Liu, Tianyang Jiang, Jian Chen, Linbo Long, Yina Lv, Jiwu Shu | 2026-03-10 | Download | Modern storage systems predominantly use flash-based SSDs as a cache layer due to their favorable performance and cost efficiency. However, in tiny-object workloads, existing flash cache designs still... |
| TrainDeeploy: Hardware-Accelerated Parameter-Efficient Fine-Tuning of Small Transformer Models at the Extreme Edge | Run Wang, Victor J. B. Jung, Philip Wiese, Francesco Conti, Alessio Burrello, Luca Benini | 2026-03-10 | Download | On-device tuning of deep neural networks enables long-term adaptation at the edge while preserving data privacy. However, the high computational and memory demands of backpropagation pose significant ... |
| DendroNN: Dendrocentric Neural Networks for Energy-Efficient Classification of Event-Based Data | Jann Krausse, Zhe Su, Kyrus Mama, Maryada, Klaus Knobloch, Giacomo Indiveri, Jürgen Becker | 2026-03-10 | Download | Spatiotemporal information is at the core of diverse sensory processing and computational tasks. Feed-forward spiking neural networks can be used to solve these tasks while offering potential benefits... |
| Wrong Code, Right Structure: Learning Netlist Representations from Imperfect LLM-Generated RTL | Siyang Cai, Cangyuan Li, Yinhe Han, Ying Wang | 2026-03-10 | Download | Learning effective netlist representations is fundamentally constrained by the scarcity of labeled datasets, as real designs are protected by Intellectual Property (IP) and costly to annotate. |
| Two Teachers Better Than One: Hardware-Physics Co-Guided Distributed Scientific Machine Learning | Yuchen Yuan, Junhuan Yang, Hao Wan, Yipei Liu, Hanhan Wu, Youzuo Lin, Lei Yang | 2026-03-10 | Download | Scientific machine learning (SciML) is increasingly applied to in-field processing, controlling, and monitoring; however, wide-area sensing, real-time demands, and strict energy and reliability constr... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| ACE Runtime - A ZKP-Native Blockchain Runtime with Sub-Second Cryptographic Finality | Jian Sheng Wang | 2026-03-10 | Download | Existing high performance blockchains verify one signature per transaction on the critical path, which creates O(N) verification cost, high hardware pressure, and difficult post quantum migration. |
| The Bureaucracy of Speed: Structural Equivalence Between Memory Consistency Models and Multi-Agent Authorization Revocation | Vladyslav Parakhin | 2026-03-10 | Download | The temporal assumptions underpinning conventional Identity and Access Management collapse under agentic execution regimes. A sixty-second revocation window permits on the order of $6 \times 10^3$ una... |
| Rate-Distortion Bounds for Heterogeneous Random Fields on Finite Lattices | Sujata Sinha, Vishwas Rao, Robert Underwood, David Lenz, Sheng Di, Franck Cappello, Lingjia Liu | 2026-03-10 | Download | Since Shannon's foundational work, rate-distortion theory has defined the fundamental limits of lossy compression. Classical results, derived for memoryless and stationary ergodic sources in the asymp... |
| Ensuring Data Freshness in Multi-Rate Task Chains Scheduling | José Luis Conradi Hoffmann, Antônio Augusto Fröhlich | 2026-03-10 | Download | In safety-critical autonomous systems, data freshness presents a fundamental design challenge. While the Logical Execution Time (LET) paradigm ensures compositional determinism, it often does so at th... |
| Pooling Engram Conditional Memory in Large Language Models using CXL | Ruiyang Ma, Teng Ma, Zhiyuan Su, Hantian Zha, Xinpeng Zhao, Xuchun Shang, Xingrui Yi, Zheng Liu, Zhu Cao, An Wu, Zhichong Dou, Ziqian Liu, Daikang Kuang, Guojie Luo | 2026-03-10 | Download | Engram conditional memory has emerged as a promising component for LLMs by decoupling static knowledge lookup from dynamic computation. Since Engram exhibits sparse access patterns and supports prefet... |
| Multi-DNN Inference of Sparse Models on Edge SoCs | Jiawei Luo, Di Wu, Simon Dobson, Blesson Varghese | 2026-03-10 | Download | Modern edge applications increasingly require multi-DNN inference systems to execute tasks on heterogeneous processors, gaining performance from both concurrent execution and from matching each model ... |
| Randomized Distributed Function Computation (RDFC): Ultra-Efficient Semantic Communication Applications to Privacy | Onur Günlü | 2026-03-10 | Download | We establish the randomized distributed function computation (RDFC) framework, in which a sender transmits just enough information for a receiver to generate a randomized function of the input data. |
| Case Study: Performance Analysis of a Virtualized XRootD Frontend in Large-Scale WAN Transfers | J M da Silva, M A Costa, R L Iope | 2026-03-10 | Download | This paper presents a detailed case study of the T2_BR_SPRACE storage frontend architecture and its observed performance in high-intensity data transfers. |
| Compiler-First State Space Duality and Portable $O(1)$ Autoregressive Caching for Inference | Cosmo Santoni | 2026-03-10 | Download | State-space model releases are typically coupled to fused CUDA and Triton kernels, inheriting a hard dependency on NVIDIA hardware. We show that Mamba-2's state space duality algorithm -- diagonal sta... |
| Flash-KMeans: Fast and Memory-Efficient Exact K-Means | Shuo Yang, Haocheng Xi, Yilong Zhao, Muyang Li, Xiaoze Fan, Jintao Zhang, Han Cai, Yujun Lin, Xiuyu Li, Kurt Keutzer, Song Han, Chenfeng Xu, Ion Stoica | 2026-03-10 | Download | $k$-means has historically been positioned primarily as an offline processing primitive, typically used for dataset organization or embedding preprocessing rather than as a first-class component in on... |
| PIM-SHERPA: Software Method for On-device LLM Inference by Resolving PIM Memory Attribute and Layout Inconsistencies | Sunjung Lee, Sanghoon Cha, Hyeonsu Kim, Seungwoo Seo, Yuhwan Ro, Sukhan Lee, Byeongho Kim, Yongjun Park, Kyomin Sohn, Seungwon Lee, Jaehoon Yu | 2026-03-10 | Download | On-device deployments of large language models (LLMs) are rapidly proliferating across mobile and edge platforms. LLM inference comprises a compute-intensive prefill phase and a memory bandwidth-inten... |
| Hierarchical Observe-Orient-Decide-Act Enabled UAV Swarms in Uncertain Environments: Frameworks, Potentials, and Challenges | Ziye Jia, Yao Wu, Qihui Wu, Lijun He, Qiuming Zhu, Fuhui Zhou, Zhu Han | 2026-03-10 | Download | Unmanned aerial vehicle (UAV) swarms are increasingly explored for their potentials in various applications such as surveillance, disaster response, and military. |
| Nezha: A Key-Value Separated Distributed Store with Optimized Raft Integration | Yangyang Wang, Yucong Dong, Ziqian Cheng, Zichen Xu | 2026-03-10 | Download | Distributed key-value stores are widely adopted to support elastic big data applications, leveraging purpose-built consensus algorithms like Raft to ensure data consistency. |
| Accelerating High-Order Finite Element Simulations at Extreme Scale with FP64 Tensor Cores | Jiqun Tu, Ian Karlin, John Camier, Veselin Dobrev, Tzanio Kolev, Stefan Henneking, Omar Ghattas | 2026-03-10 | Download | Finite element simulations play a critical role in a wide range of applications, from automotive design to tsunami modeling and computational electromagnetics. |
| Two Teachers Better Than One: Hardware-Physics Co-Guided Distributed Scientific Machine Learning | Yuchen Yuan, Junhuan Yang, Hao Wan, Yipei Liu, Hanhan Wu, Youzuo Lin, Lei Yang | 2026-03-10 | Download | Scientific machine learning (SciML) is increasingly applied to in-field processing, controlling, and monitoring; however, wide-area sensing, real-time demands, and strict energy and reliability constr... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Fly-PRAC: Packet Recovery for Random Linear Network Coding | Hosein K. Nazari, Stefan Senk, Peyman Pahlevani, Juan A. Cabrera, Frank H. P. Fitzek | 2026-03-10 | Download | Network Coding (NC) is a compelling solution for increasing network efficiency. However, it discards corrupted packets and cannot achieve optimal performance in noisy communications. |
| Performance Evaluation of Delay Tolerant Network Protocols to Improve Nepal Earthquake Rescue Communications | Xiaofei Liu, Milena Radenkovic | 2026-03-10 | Download | In the fields of disaster rescue and communication in extreme environments, Delay Tolerant Network (DTN) has become an important technology due to its "store-carry-forward" mechanism. |
| Towards Flexible Spectrum Access: Data-Driven Insights into Spectrum Demand | Mohamad Alkadamani, Amir Ghasemi, Halim Yanikomeroglu | 2026-03-10 | Download | In the diverse landscape of 6G networks, where wireless connectivity demands surge and spectrum resources remain limited, flexible spectrum access becomes paramount. |
| Role Classification of Hosts within Enterprise Networks Based on Connection Patterns | Godfrey Tan, Massimiliano Poletto, John Guttag, Frans Kaashoek | 2026-03-10 | Download | Role classification involves grouping hosts into related roles. It exposes the logical structure of a network, simplifies network management tasks such as policy checking and network segmentation, and... |
| The 802.11 MAC protocol leads to inefficient equilibria | Godfrey Tan, John Guttag | 2026-03-10 | Download | Wireless local area networks (WLANs) based on the family of 802.11 technologies are becoming ubiquitous. These technologies support multiple data transmission rates. |
| A Graph-Based Approach to Spectrum Demand Prediction Using Hierarchical Attention Networks | Mohamad Alkadamani, Halim Yanikomeroglu, Amir Ghasemi | 2026-03-10 | Download | The surge in wireless connectivity demand, coupled with the finite nature of spectrum resources, compels the development of efficient spectrum management approaches. |
| PixelConfig: Longitudinal Measurement and Reverse-Engineering of Meta Pixel Configurations | Abdullah Ghani, Yash Vekaria, Zubair Shafiq | 2026-03-10 | Download | Tracking pixels are used to optimize online ad campaigns through personalization, re-targeting, and conversion tracking. Past research has primarily focused on detecting the prevalence of tracking pix... |
| PPO-Based Hybrid Optimization for RIS-Assisted Semantic Vehicular Edge Computing | Wei Feng, Jingbo Zhang, Qiong Wu, Pingyi Fan, Qiang Fan | 2026-03-10 | Download | To support latency-sensitive Internet of Vehicles (IoV) applications amidst dynamic environments and intermittent links, this paper proposes a Reconfigurable Intelligent Surface (RIS)-aided semantic-a... |

cs.OS - Operating Systems

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Ensuring Data Freshness in Multi-Rate Task Chains Scheduling | José Luis Conradi Hoffmann, Antônio Augusto Fröhlich | 2026-03-10 | Download | In safety-critical autonomous systems, data freshness presents a fundamental design challenge. While the Logical Execution Time (LET) paradigm ensures compositional determinism, it often does so at th... |
| FlexServe: A Fast and Secure LLM Serving System for Mobile Devices with Flexible Resource Isolation | Yinpeng Wu, Yitong Chen, Lixiang Wang, Jinyu Gu, Zhichao Hua, Yubin Xia | 2026-03-10 | Download | Device-side Large Language Models (LLMs) have witnessed explosive growth, offering higher privacy and availability compared to cloud-side LLMs. |

cs.PF - Performance

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Multi-DNN Inference of Sparse Models on Edge SoCs | Jiawei Luo, Di Wu, Simon Dobson, Blesson Varghese | 2026-03-10 | Download | Modern edge applications increasingly require multi-DNN inference systems to execute tasks on heterogeneous processors, gaining performance from both concurrent execution and from matching each model ... |
| Compiler-First State Space Duality and Portable $O(1)$ Autoregressive Caching for Inference | Cosmo Santoni | 2026-03-10 | Download | State-space model releases are typically coupled to fused CUDA and Triton kernels, inheriting a hard dependency on NVIDIA hardware. We show that Mamba-2's state space duality algorithm -- diagonal sta... |
| Dynamic Precision Math Engine for Linear Algebra and Trigonometry Acceleration on Xtensa LX6 Microcontrollers | Elian Alfonso Lopez Preciado | 2026-03-10 | Download | Low-cost embedded processors such as the ESP32 (Xtensa LX6, 32-bit dual-core, 240 MHz) are increasingly used in edge computing applications that require real-time physical simulation, sensor fusion, a... |
| Accelerating High-Order Finite Element Simulations at Extreme Scale with FP64 Tensor Cores | Jiqun Tu, Ian Karlin, John Camier, Veselin Dobrev, Tzanio Kolev, Stefan Henneking, Omar Ghattas | 2026-03-10 | Download | Finite element simulations play a critical role in a wide range of applications, from automotive design to tsunami modeling and computational electromagnetics. |
