2026-02-12

cs.AR - Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| MXFormer: A Microscaling Floating-Point Charge-Trap Transistor Compute-in-Memory Transformer Accelerator | George Karfakis, Samyak Chakrabarty, Vinod Kurian Jacob, Siyun Qiao, Subramanian S. Iyer, Sudhakar Pamarti, Puneet Gupta | 2026-02-12 | Download | The proliferation of Transformer models is often constrained by the significant computational and memory bandwidth demands of deployment. To address this, we present MXFormer, a novel, hybrid, weight-... |
| CacheMind: From Miss Rates to Why -- Natural-Language, Trace-Grounded Reasoning for Cache Replacement | Kaushal Mhapsekar, Azam Ghanbari, Bita Aslrousta, Samira Mirbagher-Ajorpaz | 2026-02-12 | Download | Cache replacement remains a challenging problem in CPU microarchitecture, often addressed using hand-crafted heuristics, limiting cache performance. |
| MING: An Automated CNN-to-Edge MLIR HLS framework | Jiahong Bi, Lars Schütze, Jeronimo Castrillon | 2026-02-12 | Download | Driven by the increasing demand for low-latency and real-time processing, machine learning applications are steadily migrating toward edge computing platforms, where Field-Programmable Gate Arrays (FP... |
| ALADIN: Accuracy-Latency-Aware Design-space Inference Analysis for Embedded AI Accelerators | T. Baldi, D. Casini, A. Biondi | 2026-02-12 | Download | The inference of deep neural networks (DNNs) on resource-constrained embedded systems introduces non-trivial trade-offs among model accuracy, computational latency, and hardware limitations, particula... |
| Device-Circuit Co-Design of Variation-Resilient Read and Write Drivers for Antiferromagnetic Tunnel Junction (AFMTJ) Memories | Yousuf Choudhary, Tosiron Adegbija | 2026-02-12 | Download | Antiferromagnetic Tunnel Junctions (AFMTJs) offer picosecond switching and high integration density for in-memory computing, but their ultrafast dynamics and low tunnel magnetoresistance (TMR) make st... |
| Benchmarking for Single Feature Attribution with Microarchitecture Cliffs | Hao Zhen, Qingxuan Kang, Yungang Bao, Trevor E. Carlson | 2026-02-12 | Download | Architectural simulators play a critical role in early microarchitectural exploration due to their flexibility and high productivity. However, their effectiveness is often constrained by fidelity: sim... |
| PASCAL: A Phase-Aware Scheduling Algorithm for Serving Reasoning-based Large Language Models | Eunyeong Cho, Jehyeon Bang, Ranggi Hwang, Minsoo Rhu | 2026-02-12 | Download | The emergence of reasoning-based LLMs leveraging Chain-of-Thought (CoT) inference introduces new serving challenges, as their extended reasoning phases delay user-visible output and inflate Time-To-Fi... |
| PAM: Processing Across Memory Hierarchy for Efficient KV-centric LLM Serving System | Lian Liu, Shixin Zhao, Yutian Zhou, Yintao He, Mengdi Wang, Yinhe Han, Ying Wang | 2026-02-12 | Download | The widespread adoption of Large Language Models (LLMs) has exponentially increased the demand for efficient serving systems. With growing requests and context lengths, key-value (KV)-related operatio... |
| RooflineBench: A Benchmarking Framework for On-Device LLMs via Roofline Analysis | Zhen Bi, Xueshu Chen, Luoyang Sun, Yuhang Yao, Qing Shen, Jungang Lou, Cheng Deng | 2026-02-12 | Download | The transition toward localized intelligence through Small Language Models (SLMs) has intensified the need for rigorous performance characterization on resource-constrained edge hardware. |
| EM-Aware Physical Synthesis: Neural Inductor Modeling and Intelligent Placement & Routing for RF Circuits | Yilun Huang, Asal Mehradfar, Salman Avestimehr, Hamidreza Aghasi | 2026-02-12 | Download | This paper presents an ML-driven framework for automated RF physical synthesis that transforms circuit netlists into manufacturable GDSII layouts. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| DRAMatic Speedup: Accelerating HE Operations on a Processing-in-Memory System | Niklas Klinger, Jonas Sander, Peterson Yuhala, Pascal Felber, Thomas Eisenbarth | 2026-02-12 | Download | Homomorphic encryption (HE) is a promising technology for confidential cloud computing, as it allows computations on encrypted data. However, HE is computationally expensive and often memory-bound on ... |
| Legitimate Overrides in Decentralized Protocols | Oghenekaro Elem, Nimrod Talmon | 2026-02-12 | Download | Decentralized protocols claim immutable, rule-based execution, yet many embed emergency mechanisms such as chain-level freezes, protocol pauses, and account quarantines. |
| OServe: Accelerating LLM Serving via Spatial-Temporal Workload Orchestration | Youhe Jiang, Fangcheng Fu, Taiyi Wang, Guoliang He, Eiko Yoneki | 2026-02-12 | Download | Serving Large Language Models (LLMs) can benefit immensely from parallelizing both the model and input requests across multiple devices, but incoming workloads exhibit substantial spatial and temporal... |
| Contention Resolution, With and Without a Global Clock | Zixi Cai, Kuowen Chen, Shengquan Du, Tsvi Kopelowitz, Seth Pettie, Ben Plosk | 2026-02-12 | Download | In the Contention Resolution problem n parties each wish to have exclusive use of a shared resource for one unit of time. The problem has been studied since the early 1970s, under a variety of assum... |
| PrefillShare: A Shared Prefill Module for KV Reuse in Multi-LLM Disaggregated Serving | Sunghyeon Woo, Hoseung Kim, Sunghwan Shim, Minjung Jo, Hyunjoon Jeong, Jeongtae Lee, Joonghoon Kim, Sungjae Lee, Baeseong Park, Se Jung Kwon, Dongsoo Lee | 2026-02-12 | Download | Multi-agent systems increasingly orchestrate multiple specialized language models to solve complex real-world problems, often invoking them over a shared context. |
| An Auction-Based Mechanism for Optimal Task Allocation and Resource Aware Containerization | Ramakant kumar | 2026-02-12 | Download | Distributed computing has enabled cooperation between multiple computing devices for the simultaneous execution of resource-hungry tasks. Such execution also plays a pivotal role in the parallel execu... |
| MUSE: Multi-Tenant Model Serving With Seamless Model Updates | Cláudio Correia, Alberto E. A. Ferreira, Lucas Martins, Miguel P. Bento, Sofia Guerreiro, Ricardo Ribeiro Pereira, Ana Sofia Gomes, Jacopo Bono, Hugo Ferreira, Pedro Bizarro | 2026-02-12 | Download | In binary classification systems, decision thresholds translate model scores into actions. Choosing suitable thresholds relies on the specific distribution of the underlying model scores but also on t... |
| Designing Scalable Rate Limiting Systems: Algorithms, Architecture, and Distributed Solutions | Bo Guan | 2026-02-12 | Download | Designing a rate limiter that is simultaneously accurate, available, and scalable presents a fundamental challenge in distributed systems, primarily due to the trade-offs between algorithmic precision... |
| GORGO: Maximizing KV-Cache Reuse While Minimizing Network Latency in Cross-Region LLM Load Balancing | Alessio Ricci Toniolo, Abinaya Dinesh, Rome Thorstenson | 2026-02-12 | Download | Distributing LLM inference across geographical regions can improve Time-to-First-Token (TTFT) by regionalizing service deployments. While existing multi-region load balancers save prefill computation ... |
| LAER-MoE: Load-Adaptive Expert Re-layout for Efficient Mixture-of-Experts Training | Xinyi Liu, Yujie Wang, Fangcheng Fu, Xuefeng Xiao, Huixia Li, Jiashi Li, Bin Cui | 2026-02-12 | Download | Expert parallelism is vital for effectively training Mixture-of-Experts (MoE) models, enabling different devices to host distinct experts, with each device processing different input data. |
| LoRA-based Parameter-Efficient LLMs for Continuous Learning in Edge-based Malware Detection | Christian Rondanini, Barbara Carminati, Elena Ferrari, Niccolò Lardo, Ashish Kundu | 2026-02-12 | Download | The proliferation of edge devices has created an urgent need for security solutions capable of detecting malware in real time while operating under strict computational and memory constraints. |
| OptiML: An End-to-End Framework for Program Synthesis and CUDA Kernel Optimization | Arijit Bhattacharjee, Heng Ping, Son Vu Le, Paul Bogdan, Nesreen K. Ahmed, Ali Jannesari | 2026-02-12 | Download | Generating high-performance CUDA kernels remains challenging due to the need to navigate a combinatorial space of low-level transformations under noisy and expensive hardware feedback. |
| Differentially Private Perturbed Push-Sum Protocol and Its Application in Non-Convex Optimization | Yiming Zhou, Kaiping Xue, Enhong Chen | 2026-02-12 | Download | In decentralized networks, nodes cannot ensure that their shared information will be securely preserved by their neighbors, making privacy vulnerable to inference by curious nodes. |
| PAM: Processing Across Memory Hierarchy for Efficient KV-centric LLM Serving System | Lian Liu, Shixin Zhao, Yutian Zhou, Yintao He, Mengdi Wang, Yinhe Han, Ying Wang | 2026-02-12 | Download | The widespread adoption of Large Language Models (LLMs) has exponentially increased the demand for efficient serving systems. With growing requests and context lengths, key-value (KV)-related operatio... |
| Future Mining: Learning for Safety and Security | Md Sazedur Rahman, Mizanur Rahman Jewel, Sanjay Madria | 2026-02-12 | Download | Mining is rapidly evolving into an AI driven cyber physical ecosystem where safety and operational reliability depend on robust perception, trustworthy distributed intelligence, and continuous monitor... |
| RL over Commodity Networks: Overcoming the Bandwidth Barrier with Lossless Sparse Deltas | Chaoyi Ruan, Geng Luo, Xinyi Wan, Long Zhao, Qinghe Wang, Jiaan Zhu, Duling Xu, Guanbin Xu, Dehui Wei, Xiang Liu, Cheng Li, Haifeng Sun, Congcong Miao, Jialin Li | 2026-02-12 | Download | LLM post-training with reinforcement learning (RL) requires frequent synchronization of large model parameters between the trainer and distributed rollout actors. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Generalizing UxV Network Control Optimization with Disruption Tolerant Networking | Quyen Dang, Geoffrey Xie | 2026-02-12 | Download | Military and disaster relief operations increasingly rely on unmanned vehicles (UxVs). It is important to develop a network control system (NCS) that can continuously coordinate and optimize the movem... |
| Tracking The Trackers: Commercial Surveillance Occurring on U.S. Army Networks | Alexander Master, Jaclyn Fox, Nicolas Starck, Maxwell Love, Benjamin Allison | 2026-02-12 | Download | Despite current security implementations, Internet activity on DoD networks is susceptible to web trackers and commercial data collection, which have the potential to expose information about service ... |
| On Borrowed Time: Measurement-Informed Understanding of the NTP Pool's Robustness to Monopoly Attacks | Robert Beverly, Erik Rye | 2026-02-12 | Download | Internet services and applications depend critically on the availability and accuracy of network time. The Network Time Protocol (NTP) is one of the oldest core network protocols and remains the de f... |
| Transmit or Idle: Efficient AoI Optimal Transmission Policy for Gossiping Receivers | Irtiza Hasan, Ahmed Arafa | 2026-02-12 | Download | We study the optimal transmission and scheduling policy for a transmitter (source) communicating with two gossiping receivers aiming at tracking the source's status over time using the age of informat... |
| 6G Empowering Future Robotics: A Vision for Next-Generation Autonomous Systems | Mona Ghassemian, Andrés Meseguer Valenzuela, Ana Garcia Armada, Dejan Vukobratovic, Periklis Chatzimisios, Kaspar Althoefer, Ranga Rao Venkatesha Prasad | 2026-02-12 | Download | The convergence of robotics and next-generation communication is a critical driver of technological advancement. As the world transitions from 5G to 6G, the foundational capabilities of wireless netwo... |
| 3D Wi-Fi Signal Measurement in Realistic Digital Twin Testbed Environments Using Ray Tracing | Mengyuan Wang, Haopeng Wang, Haiwei Dong, Abdulmotaleb El Saddik | 2026-02-12 | Download | Accurate and efficient modeling of indoor wireless signal propagation is crucial for the deployment of next-generation Wi-Fi. This paper presents a digital twin-based measurement system that integrate... |
| Evaluation of Security-Induced Latency on 5G RAN Interfaces and User Plane Communication | Sotiris Michaelides, Jakub Lapawa, Daniel Eguiguren Chavez, Martin Henze | 2026-02-12 | Download | 5G promises enhanced performance, not only in bandwidth and capacity, but also latency and security. Its ultra-reliable low-latency configuration targets round-trip times below 1 ms, while optional sec... |
| An Auction-Based Mechanism for Optimal Task Allocation and Resource Aware Containerization | Ramakant kumar | 2026-02-12 | Download | Distributed computing has enabled cooperation between multiple computing devices for the simultaneous execution of resource-hungry tasks. Such execution also plays a pivotal role in the parallel execu... |
| Resource-Adaptive Teleportation Under Imperfect Entanglement: A Code-Puncturing Framework | Mahmoud Saad Abouamer, Jaron Skovsted Gundersen, Søren Pilegaard Rasmussen, Petar Popovski | 2026-02-12 | Download | Quantum teleportation is a foundational protocol for sending quantum information through entanglement distribution and classical communication. |
| Real-World Asset Integration in Next-Generation Communication Networks: Fundamental, Framework, and Case Study | Tingxuan Su, Haoxiang Luo, Ruichen Zhang, Yinqiu Liu, Gang Sun, Hongfang Yu, Dusit Niyato | 2026-02-12 | Download | Next-generation communication networks are characterized by integrated ultra-high reliability, ultra-low latency, massive connectivity, and ubiquitous coverage. |
| Reliable and Private Anonymous Routing for Satellite Constellations | Nilesh Vyas, Fabien Geyer, Svetoslav Duhovnikov | 2026-02-12 | Download | Shared, dynamic network infrastructures, such as dual-use LEO satellite constellations, pose critical threats to metadata privacy, particularly for state actors operating in mixed-trust environments. |
| GORGO: Maximizing KV-Cache Reuse While Minimizing Network Latency in Cross-Region LLM Load Balancing | Alessio Ricci Toniolo, Abinaya Dinesh, Rome Thorstenson | 2026-02-12 | Download | Distributing LLM inference across geographical regions can improve Time-to-First-Token (TTFT) by regionalizing service deployments. While existing multi-region load balancers save prefill computation ... |
| C-POD: An AWS Cloud Framework for Edge Pod Automation and Remote Wireless Testbed Sharing | Annoy Dey, Vineet Sreeram, Gokkul Eraivan Arutkani Aiyanathan, Maxwell McManus, Yuqing Cui, Guanying Sun, Elizabeth Serena Bentley, Nicholas Mastronarde, Zhangyu Guan | 2026-02-12 | Download | This paper presents C-POD, a cloud-native framework that automates the deployment and management of edge pods for seamless remote access and sharing of wireless testbeds. |

cs.OS - Operating Systems

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Bounded Local Generator Classes for Deterministic State Evolution | R. Jay Martin | 2026-02-12 | Download | We formalize a constructive subclass of locality-preserving deterministic operators acting on graph-indexed state systems. We define the class of Bounded Local Generator Classes (BLGC), consisting of ... |

cs.PF - Performance

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Designing Scalable Rate Limiting Systems: Algorithms, Architecture, and Distributed Solutions | Bo Guan | 2026-02-12 | Download | Designing a rate limiter that is simultaneously accurate, available, and scalable presents a fundamental challenge in distributed systems, primarily due to the trade-offs between algorithmic precision... |
| RooflineBench: A Benchmarking Framework for On-Device LLMs via Roofline Analysis | Zhen Bi, Xueshu Chen, Luoyang Sun, Yuhang Yao, Qing Shen, Jungang Lou, Cheng Deng | 2026-02-12 | Download | The transition toward localized intelligence through Small Language Models (SLMs) has intensified the need for rigorous performance characterization on resource-constrained edge hardware. |
