2026-02-27

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Shifting in-DRAM	William C. Tegge, Alex K. Jones	2026-02-27	Download	Processing-in-Memory (PIM) architectures enable computation directly within DRAM and help combat the memory wall problem. Bit-shifting is a fundamental operation that enables PIM applications such as ...
SAILOR: A Scalable and Energy-Efficient Ultra-Lightweight RISC-V for IoT Security	Christian Ewert, Tim Hardow, Melf Fritsch, Leon Dietrich, Henrik Strunck, Rainer Buchty, Mladen Berekovic, Saleh Mulhem	2026-02-27	Download	Recently, RISC-V has contributed to the development of IoT devices, requiring architectures that balance energy efficiency, compact area, and integrated security.
HTM-EAR: Importance-Preserving Tiered Memory with Hybrid Routing under Saturation	Shubham Kumar Singh	2026-02-27	Download	Memory constraints in long-running agents require structured management of accumulated facts while preserving essential information under bounded context limits.
LeGend: A Data-Driven Framework for Lemma Generation in Hardware Model Checking	Mingkai Miao, Guangyu Hu, Wei Zhang, Hongce Zhang	2026-02-27	Download	Property checking of RTL designs is a central task in formal verification. Among available engines, IC3/PDR is a widely used backbone whose performance critically depends on inductive generalization, ...
Architecture-Aware LLM Inference Optimization on AMD Instinct GPUs: A Comprehensive Benchmark and Deployment Study	Athos Georgiou	2026-02-27	Download	We present a cross-architecture evaluation of production LLM inference on AMD Instinct MI325X GPUs, benchmarking four models spanning 235B to 1 trillion parameters across three architectural families ...
Quantized Precoding for Maximizing Sum Rate in MU-MIMO Systems with Constrained Fronthaul	Yasaman Khorsandmanesh, Alva Kosasih, Emil Björnson, Joakim Jaldén	2026-02-27	Download	This paper studies a downlink multi-user multiple-input multiple-output (MU-MIMO) system, where the precoding matrix is computed at a baseband unit (BBU) and then transmitted to the remote antenna arr...
GenDRAM: Hardware-Software Co-Design of General Platform in DRAM	Tsung-Han Lu, Weihong Xu, Tajana Rosing	2026-02-27	Download	Dynamic programming (DP) algorithms, such as All-Pairs Shortest Path (APSP) and genomic sequence alignment, are fundamental to many scientific domains but are severely bottlenecked by data movement on...
FPPS: An FPGA-Based Point Cloud Processing System	Xiaofeng Zhou, Linfeng Du, Hanwei Fan, Wei Zhang	2026-02-27	Download	Point cloud processing is a computational bottleneck in autonomous driving systems, especially for real-time applications, while energy efficiency remains a critical system constraint.
Micrometer-scale displacement and thickness sensing using a single terahertz resonant-tunneling diode	Li Yi, Shota Ito, Chao Tang, Yousuke Nishida, Koji Terumoto, Toshihisa Maeda, Yuta Inose, Masayuki Fujita	2026-02-27	Download	Resonant tunneling diodes (RTDs) provide room-temperature terahertz oscillation and strong nonlinear mixing, enabling compact monostatic sensors in which a single device acts as both a bias-tunable os...
PDF: PUF-based DNN Fingerprinting for Knowledge Distillation Traceability	Ning Lyu, Yuntao Liu, Yonghong Bai, Zhiyuan Yan	2026-02-27	Download	Knowledge distillation transfers large teacher models to compact student models, enabling deployment on resource-limited platforms while suffering minimal performance degradation.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
SPARe: Stacked Parallelism with Adaptive Reordering for Fault-Tolerant LLM Pretraining Systems with 100k+ GPUs	Jin Lee, Zhonghao Chen, Xuhang He, Robert Underwood, Bogdan Nicolae, Franck Cappello, Xiaoyi Lu, Sheng Di, Zheng Zhang	2026-02-27	Download	In large-scale LLM pre-training systems with 100k+ GPUs, failures become the norm rather than the exception, and restart costs can dominate wall-clock training time.
Token Management in Multi-Tenant AI Inference Platforms	William J. Cunningham	2026-02-27	Download	Multi-tenant AI inference platforms must balance resource utilization against service-level guarantees under variable demand. Conventional approaches fail to achieve this balance: dedicated endpoints ...
CensorLess: Cost-Efficient Censorship Circumvention Through Serverless Cloud Functions	Dayeon Kang, Jade Sheffey, Mingshi Wu, Pubali Datta, Amir Houmansadr	2026-02-27	Download	With the increase in Internet censorship globally, various circumvention tools have been designed and developed. However, the monetary cost of these tools deeply impacts both user choice and the susta...
Vectorized Adaptive Histograms for Sparse Oblique Forests	Ariel Lubonja, Jungsang Yoon, Haoyin Xu, Yue Wan, Yilin Xu, Richard Stotz, Mathieu Guillame-Bert, Joshua T. Vogelstein, Randal Burns	2026-02-27	Download	Classification using sparse oblique random forests provides guarantees on uncertainty and confidence while controlling for specific error types.
nvidia-pcm: A D-Bus-Driven Platform Configuration Manager for OpenBMC Environments	Harinder Singh	2026-02-27	Download	GPU-accelerated server platforms that share most of their hardware architecture often require separate firmware images due to minor hardware differences--different component identifiers, thermal profi...
Advanced Scheduling Strategies for Distributed Quantum Computing Jobs	Gongyu Ni, Davide Ferrari, Lester Ho, Michele Amoretti	2026-02-27	Download	Distributed quantum computing (DQC) is being actively investigated as a means of scaling the number of qubits across multiple connected quantum devices.
Data Driven Optimization of GPU efficiency for Distributed LLM Adapter Serving	Ferran Agullo, Joan Oliveras, Chen Wang, Alberto Gutierrez-Torre, Olivier Tardieu, Alaa Youssef, Jordi Torres, Josep Ll. Berral	2026-02-27	Download	Large Language Model (LLM) adapters enable low-cost model specialization, but introduce complex caching and scheduling challenges in distributed serving systems where hundreds of adapters must be host...
Architecture-Aware LLM Inference Optimization on AMD Instinct GPUs: A Comprehensive Benchmark and Deployment Study	Athos Georgiou	2026-02-27	Download	We present a cross-architecture evaluation of production LLM inference on AMD Instinct MI325X GPUs, benchmarking four models spanning 235B to 1 trillion parameters across three architectural families ...
Green or Fast? Learning to Balance Cold Starts and Idle Carbon in Serverless Computing	Bowen Sun, Christos D. Antonopoulos, Evgenia Smirni, Bin Ren, Nikolaos Bellas, Spyros Lalis	2026-02-27	Download	Serverless computing simplifies cloud deployment but introduces new challenges in managing service latency and carbon emissions. Reducing cold-start latency requires retaining warm function instances,...
Mixed Choice in Asynchronous Multiparty Session Types	Laura Bocchi, Raymond Hu, Adriana Laura Voinea, Simon Thompson	2026-02-27	Download	We present a multiparty session type (MST) framework with asynchronous mixed choice (MC). We propose a core construct for MC that allows transient inconsistencies in protocol state between distributed...
MPU: Towards Secure and Privacy-Preserving Knowledge Unlearning for Large Language Models	Tiantong Wang, Xinyu Yan, Tiantong Wu, Yurong Hao, Yong Jiang, Fei Huang, Wei Yang Bryan Lim	2026-02-27	Download	Machine unlearning for large language models often faces a privacy dilemma in which strict constraints prohibit sharing either the server's parameters or the client's forget set.
Hestia: Hyperthread-Level Scheduling for Cloud Microservices with Interference-Aware Attention	Dingyu Yang, Fanyong Kong, Jie Dai, Shiyou Qian, Shuangwei Li, Jian Cao, Guangtao Xue, Gang Chen	2026-02-27	Download	Modern cloud servers routinely co-locate multiple latency-sensitive microservice instances to improve resource efficiency. However, the diversity of microservice behaviors, coupled with mutual perform...
QoSFlow: Ensuring Service Quality of Distributed Workflows Using Interpretable Sensitivity Models	Md Hasanur Rashid, Jesun Firoz, Nathan R. Tallent, Luanzheng Guo, Meng Tang, Dong Dai	2026-02-27	Download	With the increasing importance of distributed scientific workflows, there is a critical need to ensure Quality of Service (QoS) constraints, such as minimizing time or limiting execution to resource s...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
CensorLess: Cost-Efficient Censorship Circumvention Through Serverless Cloud Functions	Dayeon Kang, Jade Sheffey, Mingshi Wu, Pubali Datta, Amir Houmansadr	2026-02-27	Download	With the increase in Internet censorship globally, various circumvention tools have been designed and developed. However, the monetary cost of these tools deeply impacts both user choice and the susta...
Unsupervised Baseline Clustering and Incremental Adaptation for IoT Device Traffic Profiling	Sean M. Alderman, John D. Hastings	2026-02-27	Download	The growth and heterogeneity of IoT devices create security challenges where static identification models can degrade as traffic evolves. This paper presents a two-stage, flow-feature-based pipeline f...
Age of Entanglement in Satellite Repeater Chains with Intermittent Availability	Elif Tugce Ceran	2026-02-27	Download	Timely availability of high-fidelity entanglement is essential for emerging quantum networks. This paper introduces the Age of Entanglement (AoE) as a novel performance metric that captures the freshn...
Performance Comparison of IBN orchestration using LLM and SLMs	Wai Lwin Phone, Brahim El Boudani, Tasos Dagiuklas, Saptarshi Ghosh	2026-02-27	Download	The evolution of both 5G and 6G networks is driving the advancement of fully autonomous network management, placing Intent-Based Networking at the centre of this transformation.
An improved Lower Bound for Local Failover in Directed Networks via Binary Covering Arrays	Erik van den Akker, Klaus-Tycho Foerster	2026-02-27	Download	Communication networks often rely on some form of local failover rules for fast forwarding decisions upon link failures. While on undirected networks, up to two failures can be tolerated, when just ma...
Deep Sleep Scheduling for Satellite IoT via Simulation Based Optimization	Wanja de Sombre, Monika Tomová, Marek Galinski, Anja Klein, Andrea Ortiz	2026-02-27	Download	The Satellite Internet of Things (S-IoT) enables global connectivity for remote sensing devices that must operate energy-efficiently over long time spans.
SLA-Aware Distributed LLM Inference Across Device-RAN-Cloud	Hariz Yet, Nguyen Thanh Tam, Mao V. Ngo, Lim Yi Shen, Lin Wei, Jihong Park, Binbin Chen, Tony Q. S. Quek	2026-02-27	Download	Embodied AI requires sub-second inference near the Radio Access Network (RAN), but deployments span heterogeneous tiers (on-device, RAN-edge, cloud) and must not disrupt real-time baseband processing.
Solving No-wait Scheduling for Time-Sensitive Networks with Daisy-Chain Topology	Qian Li, Henan Liu, Heng Liu, Yuyi Wang	2026-02-27	Download	Time-Sensitive Networking (TSN) is a set of standards aiming to enable deterministic and predictable communication over Ethernet networks. However, as the standards of TSN do not specify how to schedu...
Blockchain-Enabled Routing for Zero-Trust Low-Altitude Intelligent Networks	Ziye Jia, Sijie He, Ligang Yuan, Fuhui Zhou, Qihui Wu, Zhu Han, Dusit Niyato	2026-02-27	Download	Due to their scalability and portability, low-altitude intelligent networks (LAINs) are essential in various fields such as surveillance and disaster rescue.
Toward E2E Intelligence in 6G Networks: An AI Agent-Based RAN-CN Converged Intelligence Framework	Youbin Han, Haneul Ko, Namseok Ko, Tarik Taleb, Yan Chen	2026-02-27	Download	Recent advances in intelligent network control have primarily relied on task-specific Artificial Intelligence (AI) models deployed separately within the Radio Access Network (RAN) and Core Network (CN...

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
OBASE: Object-Based Address-Space Engineering to Improve Memory Tiering	Vinay Banakar, Suli Yang, Kan Wu, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, Kimberly Keeton	2026-02-27	Download	Hardware and OS mechanisms for memory tiering are widely deployed, yet datacenters still overprovision DRAM. The root cause is hotness fragmentation: allocators place objects by size rather than acces...
Token Management in Multi-Tenant AI Inference Platforms	William J. Cunningham	2026-02-27	Download	Multi-tenant AI inference platforms must balance resource utilization against service-level guarantees under variable demand. Conventional approaches fail to achieve this balance: dedicated endpoints ...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Vectorized Adaptive Histograms for Sparse Oblique Forests	Ariel Lubonja, Jungsang Yoon, Haoyin Xu, Yue Wan, Yilin Xu, Richard Stotz, Mathieu Guillame-Bert, Joshua T. Vogelstein, Randal Burns	2026-02-27	Download	Classification using sparse oblique random forests provides guarantees on uncertainty and confidence while controlling for specific error types.
Advanced Scheduling Strategies for Distributed Quantum Computing Jobs	Gongyu Ni, Davide Ferrari, Lester Ho, Michele Amoretti	2026-02-27	Download	Distributed quantum computing (DQC) is being actively investigated as a means of scaling the number of qubits across multiple connected quantum devices.
Green or Fast? Learning to Balance Cold Starts and Idle Carbon in Serverless Computing	Bowen Sun, Christos D. Antonopoulos, Evgenia Smirni, Bin Ren, Nikolaos Bellas, Spyros Lalis	2026-02-27	Download	Serverless computing simplifies cloud deployment but introduces new challenges in managing service latency and carbon emissions. Reducing cold-start latency requires retaining warm function instances,...
QoSFlow: Ensuring Service Quality of Distributed Workflows Using Interpretable Sensitivity Models	Md Hasanur Rashid, Jesun Firoz, Nathan R. Tallent, Luanzheng Guo, Meng Tang, Dong Dai	2026-02-27	Download	With the increasing importance of distributed scientific workflows, there is a critical need to ensure Quality of Service (QoS) constraints, such as minimizing time or limiting execution to resource s...