2026-02-27

cs.AR - Architecture

Title | Authors | Published | PDF | Abstract
Shifting in-DRAM | William C. Tegge, Alex K. Jones | 2026-02-27 | Download | Processing-in-Memory (PIM) architectures enable computation directly within DRAM and help combat the memory wall problem. Bit-shifting is a fundamental operation that enables PIM applications such as ...
SAILOR: A Scalable and Energy-Efficient Ultra-Lightweight RISC-V for IoT Security | Christian Ewert, Tim Hardow, Melf Fritsch, Leon Dietrich, Henrik Strunck, Rainer Buchty, Mladen Berekovic, Saleh Mulhem | 2026-02-27 | Download | Recently, RISC-V has contributed to the development of IoT devices, requiring architectures that balance energy efficiency, compact area, and integrated security.
HTM-EAR: Importance-Preserving Tiered Memory with Hybrid Routing under Saturation | Shubham Kumar Singh | 2026-02-27 | Download | Memory constraints in long-running agents require structured management of accumulated facts while preserving essential information under bounded context limits.
LeGend: A Data-Driven Framework for Lemma Generation in Hardware Model Checking | Mingkai Miao, Guangyu Hu, Wei Zhang, Hongce Zhang | 2026-02-27 | Download | Property checking of RTL designs is a central task in formal verification. Among available engines, IC3/PDR is a widely used backbone whose performance critically depends on inductive generalization, ...
Architecture-Aware LLM Inference Optimization on AMD Instinct GPUs: A Comprehensive Benchmark and Deployment Study | Athos Georgiou | 2026-02-27 | Download | We present a cross-architecture evaluation of production LLM inference on AMD Instinct MI325X GPUs, benchmarking four models spanning 235B to 1 trillion parameters across three architectural families ...
Quantized Precoding for Maximizing Sum Rate in MU-MIMO Systems with Constrained Fronthaul | Yasaman Khorsandmanesh, Alva Kosasih, Emil Björnson, Joakim Jaldén | 2026-02-27 | Download | This paper studies a downlink multi-user multiple-input multiple-output (MU-MIMO) system, where the precoding matrix is computed at a baseband unit (BBU) and then transmitted to the remote antenna arr...
GenDRAM:Hardware-Software Co-Design of General Platform in DRAM | Tsung-Han Lu, Weihong Xu, Tajana Rosing | 2026-02-27 | Download | Dynamic programming (DP) algorithms, such as All-Pairs Shortest Path (APSP) and genomic sequence alignment, are fundamental to many scientific domains but are severely bottlenecked by data movement on...
FPPS: An FPGA-Based Point Cloud Processing System | Xiaofeng Zhou, Linfeng Du, Hanwei Fan, Wei Zhang | 2026-02-27 | Download | Point cloud processing is a computational bottleneck in autonomous driving systems, especially for real-time applications, while energy efficiency remains a critical system constraint.
Micrometer-scale displacement and thickness sensing using a single terahertz resonant-tunneling diode | Li Yi, Shota Ito, Chao Tang, Yousuke Nishida, Koji Terumoto, Toshihisa Maeda, Yuta Inose, Masayuki Fujita | 2026-02-27 | Download | Resonant tunneling diodes (RTDs) provide room-temperature terahertz oscillation and strong nonlinear mixing, enabling compact monostatic sensors in which a single device acts as both a bias-tunable os...
PDF: PUF-based DNN Fingerprinting for Knowledge Distillation Traceability | Ning Lyu, Yuntao Liu, Yonghong Bai, Zhiyuan Yan | 2026-02-27 | Download | Knowledge distillation transfers large teacher models to compact student models, enabling deployment on resource-limited platforms while suffering minimal performance degradation.

cs.DC - Distributed, Parallel, and Cluster Computing

Title | Authors | Published | PDF | Abstract
SPARe: Stacked Parallelism with Adaptive Reordering for Fault-Tolerant LLM Pretraining Systems with 100k+ GPUs | Jin Lee, Zhonghao Chen, Xuhang He, Robert Underwood, Bogdan Nicolae, Franck Cappello, Xiaoyi Lu, Sheng Di, Zheng Zhang | 2026-02-27 | Download | In large-scale LLM pre-training systems with 100k+ GPUs, failures become the norm rather than the exception, and restart costs can dominate wall-clock training time.
Token Management in Multi-Tenant AI Inference Platforms | William J. Cunningham | 2026-02-27 | Download | Multi-tenant AI inference platforms must balance resource utilization against service-level guarantees under variable demand. Conventional approaches fail to achieve this balance: dedicated endpoints ...
CensorLess: Cost-Efficient Censorship Circumvention Through Serverless Cloud Functions | Dayeon Kang, Jade Sheffey, Mingshi Wu, Pubali Datta, Amir Houmansadr | 2026-02-27 | Download | With the increase in Internet censorship globally, various circumvention tools have been designed and developed. However, the monetary cost of these tools deeply impacts both user choice and the susta...
Vectorized Adaptive Histograms for Sparse Oblique Forests | Ariel Lubonja, Jungsang Yoon, Haoyin Xu, Yue Wan, Yilin Xu, Richard Stotz, Mathieu Guillame-Bert, Joshua T. Vogelstein, Randal Burns | 2026-02-27 | Download | Classification using sparse oblique random forests provides guarantees on uncertainty and confidence while controlling for specific error types.
nvidia-pcm: A D-Bus-Driven Platform Configuration Manager for OpenBMC Environments | Harinder Singh | 2026-02-27 | Download | GPU-accelerated server platforms that share most of their hardware architecture often require separate firmware images due to minor hardware differences--different component identifiers, thermal profi...
Advanced Scheduling Strategies for Distributed Quantum Computing Jobs | Gongyu Ni, Davide Ferrari, Lester Ho, Michele Amoretti | 2026-02-27 | Download | Distributed quantum computing (DQC) is being actively investigated as a means of scaling the number of qubits across multiple connected quantum devices.
Data Driven Optimization of GPU efficiency for Distributed LLM Adapter Serving | Ferran Agullo, Joan Oliveras, Chen Wang, Alberto Gutierrez-Torre, Olivier Tardieu, Alaa Youssef, Jordi Torres, Josep Ll. Berral | 2026-02-27 | Download | Large Language Model (LLM) adapters enable low-cost model specialization, but introduce complex caching and scheduling challenges in distributed serving systems where hundreds of adapters must be host...
Architecture-Aware LLM Inference Optimization on AMD Instinct GPUs: A Comprehensive Benchmark and Deployment Study | Athos Georgiou | 2026-02-27 | Download | We present a cross-architecture evaluation of production LLM inference on AMD Instinct MI325X GPUs, benchmarking four models spanning 235B to 1 trillion parameters across three architectural families ...
Green or Fast? Learning to Balance Cold Starts and Idle Carbon in Serverless Computing | Bowen Sun, Christos D. Antonopoulos, Evgenia Smirni, Bin Ren, Nikolaos Bellas, Spyros Lalis | 2026-02-27 | Download | Serverless computing simplifies cloud deployment but introduces new challenges in managing service latency and carbon emissions. Reducing cold-start latency requires retaining warm function instances,...
Mixed Choice in Asynchronous Multiparty Session Types | Laura Bocchi, Raymond Hu, Adriana Laura Voinea, Simon Thompson | 2026-02-27 | Download | We present a multiparty session type (MST) framework with asynchronous mixed choice (MC). We propose a core construct for MC that allows transient inconsistencies in protocol state between distributed...
MPU: Towards Secure and Privacy-Preserving Knowledge Unlearning for Large Language Models | Tiantong Wang, Xinyu Yan, Tiantong Wu, Yurong Hao, Yong Jiang, Fei Huang, Wei Yang Bryan Lim | 2026-02-27 | Download | Machine unlearning for large language models often faces a privacy dilemma in which strict constraints prohibit sharing either the server's parameters or the client's forget set.
Hestia: Hyperthread-Level Scheduling for Cloud Microservices with Interference-Aware Attention | Dingyu Yang, Fanyong Kong, Jie Dai, Shiyou Qian, Shuangwei Li, Jian Cao, Guangtao Xue, Gang Chen | 2026-02-27 | Download | Modern cloud servers routinely co-locate multiple latency-sensitive microservice instances to improve resource efficiency. However, the diversity of microservice behaviors, coupled with mutual perform...
QoSFlow: Ensuring Service Quality of Distributed Workflows Using Interpretable Sensitivity Models | Md Hasanur Rashid, Jesun Firoz, Nathan R. Tallent, Luanzheng Guo, Meng Tang, Dong Dai | 2026-02-27 | Download | With the increasing importance of distributed scientific workflows, there is a critical need to ensure Quality of Service (QoS) constraints, such as minimizing time or limiting execution to resource s...

cs.NI - Networking and Internet Architecture

Title | Authors | Published | PDF | Abstract
CensorLess: Cost-Efficient Censorship Circumvention Through Serverless Cloud Functions | Dayeon Kang, Jade Sheffey, Mingshi Wu, Pubali Datta, Amir Houmansadr | 2026-02-27 | Download | With the increase in Internet censorship globally, various circumvention tools have been designed and developed. However, the monetary cost of these tools deeply impacts both user choice and the susta...
Unsupervised Baseline Clustering and Incremental Adaptation for IoT Device Traffic Profiling | Sean M. Alderman, John D. Hastings | 2026-02-27 | Download | The growth and heterogeneity of IoT devices create security challenges where static identification models can degrade as traffic evolves. This paper presents a two-stage, flow-feature-based pipeline f...
Age of Entanglement in Satellite Repeater Chains with Intermittent Availability | Elif Tugce Ceran | 2026-02-27 | Download | Timely availability of high-fidelity entanglement is essential for emerging quantum networks. This paper introduces the Age of Entanglement (AoE) as a novel performance metric that captures the freshn...
Performance Comparison of IBN orchestration using LLM and SLMs | Wai Lwin Phone, Brahim El Boudani, Tasos Dagiuklas, Saptarshi Ghosh | 2026-02-27 | Download | The evolution of both 5G and 6G networks is driving the advancement of fully autonomous network management, placing Intent-Based Networking at the centre of this transformation.
An improved Lower Bound for Local Failover in Directed Networks via Binary Covering Arrays | Erik van den Akker, Klaus-Tycho Foerster | 2026-02-27 | Download | Communication networks often rely on some form of local failover rules for fast forwarding decisions upon link failures. While on undirected networks, up to two failures can be tolerated, when just ma...
Deep Sleep Scheduling for Satellite IoT via Simulation Based Optimization | Wanja de Sombre, Monika Tomová, Marek Galinski, Anja Klein, Andrea Ortiz | 2026-02-27 | Download | The Satellite Internet of Things (S-IoT) enables global connectivity for remote sensing devices that must operate energy-efficiently over long time spans.
SLA-Aware Distributed LLM Inference Across Device-RAN-Cloud | Hariz Yet, Nguyen Thanh Tam, Mao V. Ngo, Lim Yi Shen, Lin Wei, Jihong Park, Binbin Chen, Tony Q. S. Quek | 2026-02-27 | Download | Embodied AI requires sub-second inference near the Radio Access Network (RAN), but deployments span heterogeneous tiers (on-device, RAN-edge, cloud) and must not disrupt real-time baseband processing.
Solving No-wait Scheduling for Time-Sensitive Networks with Daisy-Chain Topology | Qian Li, Henan Liu, Heng Liu, Yuyi Wang | 2026-02-27 | Download | Time-Sensitive Networking (TSN) is a set of standards aiming to enable deterministic and predictable communication over Ethernet networks. However, as the standards of TSN do not specify how to schedu...
Blockchain-Enabled Routing for Zero-Trust Low-Altitude Intelligent Networks | Ziye Jia, Sijie He, Ligang Yuan, Fuhui Zhou, Qihui Wu, Zhu Han, Dusit Niyato | 2026-02-27 | Download | Due to the scalability and portability, low-altitude intelligent networks (LAINs) are essential in various fields such as surveillance and disaster rescue.
Toward E2E Intelligence in 6G Networks: An AI Agent-Based RAN-CN Converged Intelligence Framework | Youbin Han, Haneul Ko, Namseok Ko, Tarik Taleb, Yan Chen | 2026-02-27 | Download | Recent advances in intelligent network control have primarily relied on task-specific Artificial Intelligence (AI) models deployed separately within the Radio Access Network (RAN) and Core Network (CN...

cs.OS - Operating Systems

Title | Authors | Published | PDF | Abstract
OBASE: Object-Based Address-Space Engineering to Improve Memory Tiering | Vinay Banakar, Suli Yang, Kan Wu, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, Kimberly Keeton | 2026-02-27 | Download | Hardware and OS mechanisms for memory tiering are widely deployed, yet datacenters still overprovision DRAM. The root cause is hotness fragmentation: allocators place objects by size rather than acces...
Token Management in Multi-Tenant AI Inference Platforms | William J. Cunningham | 2026-02-27 | Download | Multi-tenant AI inference platforms must balance resource utilization against service-level guarantees under variable demand. Conventional approaches fail to achieve this balance: dedicated endpoints ...

cs.PF - Performance

Title | Authors | Published | PDF | Abstract
Vectorized Adaptive Histograms for Sparse Oblique Forests | Ariel Lubonja, Jungsang Yoon, Haoyin Xu, Yue Wan, Yilin Xu, Richard Stotz, Mathieu Guillame-Bert, Joshua T. Vogelstein, Randal Burns | 2026-02-27 | Download | Classification using sparse oblique random forests provides guarantees on uncertainty and confidence while controlling for specific error types.
Advanced Scheduling Strategies for Distributed Quantum Computing Jobs | Gongyu Ni, Davide Ferrari, Lester Ho, Michele Amoretti | 2026-02-27 | Download | Distributed quantum computing (DQC) is being actively investigated as a means of scaling the number of qubits across multiple connected quantum devices.
Green or Fast? Learning to Balance Cold Starts and Idle Carbon in Serverless Computing | Bowen Sun, Christos D. Antonopoulos, Evgenia Smirni, Bin Ren, Nikolaos Bellas, Spyros Lalis | 2026-02-27 | Download | Serverless computing simplifies cloud deployment but introduces new challenges in managing service latency and carbon emissions. Reducing cold-start latency requires retaining warm function instances,...
QoSFlow: Ensuring Service Quality of Distributed Workflows Using Interpretable Sensitivity Models | Md Hasanur Rashid, Jesun Firoz, Nathan R. Tallent, Luanzheng Guo, Meng Tang, Dong Dai | 2026-02-27 | Download | With the increasing importance of distributed scientific workflows, there is a critical need to ensure Quality of Service (QoS) constraints, such as minimizing time or limiting execution to resource s...
