2026-02-27
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Shifting in-DRAM | William C. Tegge, Alex K. Jones | 2026-02-27 | Download | Processing-in-Memory (PIM) architectures enable computation directly within DRAM and help combat the memory wall problem. Bit-shifting is a fundamental operation that enables PIM applications such as ... |
| SAILOR: A Scalable and Energy-Efficient Ultra-Lightweight RISC-V for IoT Security | Christian Ewert, Tim Hardow, Melf Fritsch, Leon Dietrich, Henrik Strunck, Rainer Buchty, Mladen Berekovic, Saleh Mulhem | 2026-02-27 | Download | Recently, RISC-V has contributed to the development of IoT devices, requiring architectures that balance energy efficiency, compact area, and integrated security. |
| HTM-EAR: Importance-Preserving Tiered Memory with Hybrid Routing under Saturation | Shubham Kumar Singh | 2026-02-27 | Download | Memory constraints in long-running agents require structured management of accumulated facts while preserving essential information under bounded context limits. |
| LeGend: A Data-Driven Framework for Lemma Generation in Hardware Model Checking | Mingkai Miao, Guangyu Hu, Wei Zhang, Hongce Zhang | 2026-02-27 | Download | Property checking of RTL designs is a central task in formal verification. Among available engines, IC3/PDR is a widely used backbone whose performance critically depends on inductive generalization, ... |
| Architecture-Aware LLM Inference Optimization on AMD Instinct GPUs: A Comprehensive Benchmark and Deployment Study | Athos Georgiou | 2026-02-27 | Download | We present a cross-architecture evaluation of production LLM inference on AMD Instinct MI325X GPUs, benchmarking four models spanning 235B to 1 trillion parameters across three architectural families ... |
| Quantized Precoding for Maximizing Sum Rate in MU-MIMO Systems with Constrained Fronthaul | Yasaman Khorsandmanesh, Alva Kosasih, Emil Björnson, Joakim Jaldén | 2026-02-27 | Download | This paper studies a downlink multi-user multiple-input multiple-output (MU-MIMO) system, where the precoding matrix is computed at a baseband unit (BBU) and then transmitted to the remote antenna arr... |
| GenDRAM: Hardware-Software Co-Design of General Platform in DRAM | Tsung-Han Lu, Weihong Xu, Tajana Rosing | 2026-02-27 | Download | Dynamic programming (DP) algorithms, such as All-Pairs Shortest Path (APSP) and genomic sequence alignment, are fundamental to many scientific domains but are severely bottlenecked by data movement on... |
| FPPS: An FPGA-Based Point Cloud Processing System | Xiaofeng Zhou, Linfeng Du, Hanwei Fan, Wei Zhang | 2026-02-27 | Download | Point cloud processing is a computational bottleneck in autonomous driving systems, especially for real-time applications, while energy efficiency remains a critical system constraint. |
| Micrometer-scale displacement and thickness sensing using a single terahertz resonant-tunneling diode | Li Yi, Shota Ito, Chao Tang, Yousuke Nishida, Koji Terumoto, Toshihisa Maeda, Yuta Inose, Masayuki Fujita | 2026-02-27 | Download | Resonant tunneling diodes (RTDs) provide room-temperature terahertz oscillation and strong nonlinear mixing, enabling compact monostatic sensors in which a single device acts as both a bias-tunable os... |
| PDF: PUF-based DNN Fingerprinting for Knowledge Distillation Traceability | Ning Lyu, Yuntao Liu, Yonghong Bai, Zhiyuan Yan | 2026-02-27 | Download | Knowledge distillation transfers large teacher models to compact student models, enabling deployment on resource-limited platforms while suffering minimal performance degradation. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| SPARe: Stacked Parallelism with Adaptive Reordering for Fault-Tolerant LLM Pretraining Systems with 100k+ GPUs | Jin Lee, Zhonghao Chen, Xuhang He, Robert Underwood, Bogdan Nicolae, Franck Cappello, Xiaoyi Lu, Sheng Di, Zheng Zhang | 2026-02-27 | Download | In large-scale LLM pre-training systems with 100k+ GPUs, failures become the norm rather than the exception, and restart costs can dominate wall-clock training time. |
| Token Management in Multi-Tenant AI Inference Platforms | William J. Cunningham | 2026-02-27 | Download | Multi-tenant AI inference platforms must balance resource utilization against service-level guarantees under variable demand. Conventional approaches fail to achieve this balance: dedicated endpoints ... |
| CensorLess: Cost-Efficient Censorship Circumvention Through Serverless Cloud Functions | Dayeon Kang, Jade Sheffey, Mingshi Wu, Pubali Datta, Amir Houmansadr | 2026-02-27 | Download | With the increase in Internet censorship globally, various circumvention tools have been designed and developed. However, the monetary cost of these tools deeply impacts both user choice and the susta... |
| Vectorized Adaptive Histograms for Sparse Oblique Forests | Ariel Lubonja, Jungsang Yoon, Haoyin Xu, Yue Wan, Yilin Xu, Richard Stotz, Mathieu Guillame-Bert, Joshua T. Vogelstein, Randal Burns | 2026-02-27 | Download | Classification using sparse oblique random forests provides guarantees on uncertainty and confidence while controlling for specific error types. |
| nvidia-pcm: A D-Bus-Driven Platform Configuration Manager for OpenBMC Environments | Harinder Singh | 2026-02-27 | Download | GPU-accelerated server platforms that share most of their hardware architecture often require separate firmware images due to minor hardware differences--different component identifiers, thermal profi... |
| Advanced Scheduling Strategies for Distributed Quantum Computing Jobs | Gongyu Ni, Davide Ferrari, Lester Ho, Michele Amoretti | 2026-02-27 | Download | Distributed quantum computing (DQC) is being actively investigated as a means of scaling the number of qubits across multiple connected quantum devices. |
| Data Driven Optimization of GPU efficiency for Distributed LLM Adapter Serving | Ferran Agullo, Joan Oliveras, Chen Wang, Alberto Gutierrez-Torre, Olivier Tardieu, Alaa Youssef, Jordi Torres, Josep Ll. Berral | 2026-02-27 | Download | Large Language Model (LLM) adapters enable low-cost model specialization, but introduce complex caching and scheduling challenges in distributed serving systems where hundreds of adapters must be host... |
| Architecture-Aware LLM Inference Optimization on AMD Instinct GPUs: A Comprehensive Benchmark and Deployment Study | Athos Georgiou | 2026-02-27 | Download | We present a cross-architecture evaluation of production LLM inference on AMD Instinct MI325X GPUs, benchmarking four models spanning 235B to 1 trillion parameters across three architectural families ... |
| Green or Fast? Learning to Balance Cold Starts and Idle Carbon in Serverless Computing | Bowen Sun, Christos D. Antonopoulos, Evgenia Smirni, Bin Ren, Nikolaos Bellas, Spyros Lalis | 2026-02-27 | Download | Serverless computing simplifies cloud deployment but introduces new challenges in managing service latency and carbon emissions. Reducing cold-start latency requires retaining warm function instances,... |
| Mixed Choice in Asynchronous Multiparty Session Types | Laura Bocchi, Raymond Hu, Adriana Laura Voinea, Simon Thompson | 2026-02-27 | Download | We present a multiparty session type (MST) framework with asynchronous mixed choice (MC). We propose a core construct for MC that allows transient inconsistencies in protocol state between distributed... |
| MPU: Towards Secure and Privacy-Preserving Knowledge Unlearning for Large Language Models | Tiantong Wang, Xinyu Yan, Tiantong Wu, Yurong Hao, Yong Jiang, Fei Huang, Wei Yang Bryan Lim | 2026-02-27 | Download | Machine unlearning for large language models often faces a privacy dilemma in which strict constraints prohibit sharing either the server's parameters or the client's forget set. |
| Hestia: Hyperthread-Level Scheduling for Cloud Microservices with Interference-Aware Attention | Dingyu Yang, Fanyong Kong, Jie Dai, Shiyou Qian, Shuangwei Li, Jian Cao, Guangtao Xue, Gang Chen | 2026-02-27 | Download | Modern cloud servers routinely co-locate multiple latency-sensitive microservice instances to improve resource efficiency. However, the diversity of microservice behaviors, coupled with mutual perform... |
| QoSFlow: Ensuring Service Quality of Distributed Workflows Using Interpretable Sensitivity Models | Md Hasanur Rashid, Jesun Firoz, Nathan R. Tallent, Luanzheng Guo, Meng Tang, Dong Dai | 2026-02-27 | Download | With the increasing importance of distributed scientific workflows, there is a critical need to ensure Quality of Service (QoS) constraints, such as minimizing time or limiting execution to resource s... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| CensorLess: Cost-Efficient Censorship Circumvention Through Serverless Cloud Functions | Dayeon Kang, Jade Sheffey, Mingshi Wu, Pubali Datta, Amir Houmansadr | 2026-02-27 | Download | With the increase in Internet censorship globally, various circumvention tools have been designed and developed. However, the monetary cost of these tools deeply impacts both user choice and the susta... |
| Unsupervised Baseline Clustering and Incremental Adaptation for IoT Device Traffic Profiling | Sean M. Alderman, John D. Hastings | 2026-02-27 | Download | The growth and heterogeneity of IoT devices create security challenges where static identification models can degrade as traffic evolves. This paper presents a two-stage, flow-feature-based pipeline f... |
| Age of Entanglement in Satellite Repeater Chains with Intermittent Availability | Elif Tugce Ceran | 2026-02-27 | Download | Timely availability of high-fidelity entanglement is essential for emerging quantum networks. This paper introduces the Age of Entanglement (AoE) as a novel performance metric that captures the freshn... |
| Performance Comparison of IBN orchestration using LLM and SLMs | Wai Lwin Phone, Brahim El Boudani, Tasos Dagiuklas, Saptarshi Ghosh | 2026-02-27 | Download | The evolution of both 5G and 6G networks is driving the advancement of fully autonomous network management, placing Intent-Based Networking at the centre of this transformation. |
| An improved Lower Bound for Local Failover in Directed Networks via Binary Covering Arrays | Erik van den Akker, Klaus-Tycho Foerster | 2026-02-27 | Download | Communication networks often rely on some form of local failover rules for fast forwarding decisions upon link failures. While on undirected networks, up to two failures can be tolerated, when just ma... |
| Deep Sleep Scheduling for Satellite IoT via Simulation Based Optimization | Wanja de Sombre, Monika Tomová, Marek Galinski, Anja Klein, Andrea Ortiz | 2026-02-27 | Download | The Satellite Internet of Things (S-IoT) enables global connectivity for remote sensing devices that must operate energy-efficiently over long time spans. |
| SLA-Aware Distributed LLM Inference Across Device-RAN-Cloud | Hariz Yet, Nguyen Thanh Tam, Mao V. Ngo, Lim Yi Shen, Lin Wei, Jihong Park, Binbin Chen, Tony Q. S. Quek | 2026-02-27 | Download | Embodied AI requires sub-second inference near the Radio Access Network (RAN), but deployments span heterogeneous tiers (on-device, RAN-edge, cloud) and must not disrupt real-time baseband processing. |
| Solving No-wait Scheduling for Time-Sensitive Networks with Daisy-Chain Topology | Qian Li, Henan Liu, Heng Liu, Yuyi Wang | 2026-02-27 | Download | Time-Sensitive Networking (TSN) is a set of standards aiming to enable deterministic and predictable communication over Ethernet networks. However, as the standards of TSN do not specify how to schedu... |
| Blockchain-Enabled Routing for Zero-Trust Low-Altitude Intelligent Networks | Ziye Jia, Sijie He, Ligang Yuan, Fuhui Zhou, Qihui Wu, Zhu Han, Dusit Niyato | 2026-02-27 | Download | Due to the scalability and portability, low-altitude intelligent networks (LAINs) are essential in various fields such as surveillance and disaster rescue. |
| Toward E2E Intelligence in 6G Networks: An AI Agent-Based RAN-CN Converged Intelligence Framework | Youbin Han, Haneul Ko, Namseok Ko, Tarik Taleb, Yan Chen | 2026-02-27 | Download | Recent advances in intelligent network control have primarily relied on task-specific Artificial Intelligence (AI) models deployed separately within the Radio Access Network (RAN) and Core Network (CN... |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| OBASE: Object-Based Address-Space Engineering to Improve Memory Tiering | Vinay Banakar, Suli Yang, Kan Wu, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, Kimberly Keeton | 2026-02-27 | Download | Hardware and OS mechanisms for memory tiering are widely deployed, yet datacenters still overprovision DRAM. The root cause is hotness fragmentation: allocators place objects by size rather than acces... |
| Token Management in Multi-Tenant AI Inference Platforms | William J. Cunningham | 2026-02-27 | Download | Multi-tenant AI inference platforms must balance resource utilization against service-level guarantees under variable demand. Conventional approaches fail to achieve this balance: dedicated endpoints ... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Vectorized Adaptive Histograms for Sparse Oblique Forests | Ariel Lubonja, Jungsang Yoon, Haoyin Xu, Yue Wan, Yilin Xu, Richard Stotz, Mathieu Guillame-Bert, Joshua T. Vogelstein, Randal Burns | 2026-02-27 | Download | Classification using sparse oblique random forests provides guarantees on uncertainty and confidence while controlling for specific error types. |
| Advanced Scheduling Strategies for Distributed Quantum Computing Jobs | Gongyu Ni, Davide Ferrari, Lester Ho, Michele Amoretti | 2026-02-27 | Download | Distributed quantum computing (DQC) is being actively investigated as a means of scaling the number of qubits across multiple connected quantum devices. |
| Green or Fast? Learning to Balance Cold Starts and Idle Carbon in Serverless Computing | Bowen Sun, Christos D. Antonopoulos, Evgenia Smirni, Bin Ren, Nikolaos Bellas, Spyros Lalis | 2026-02-27 | Download | Serverless computing simplifies cloud deployment but introduces new challenges in managing service latency and carbon emissions. Reducing cold-start latency requires retaining warm function instances,... |
| QoSFlow: Ensuring Service Quality of Distributed Workflows Using Interpretable Sensitivity Models | Md Hasanur Rashid, Jesun Firoz, Nathan R. Tallent, Luanzheng Guo, Meng Tang, Dong Dai | 2026-02-27 | Download | With the increasing importance of distributed scientific workflows, there is a critical need to ensure Quality of Service (QoS) constraints, such as minimizing time or limiting execution to resource s... |