2025-08-05

cs.AR - Architecture

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| TROOP: At-the-Roofline Performance for Vector Processors on Low Operational Intensity Workloads | Navaneeth Kunhi Purayil, Diyou Shen, Matteo Perotti, Luca Benini | 2025-08-05 | The fast evolution of Machine Learning (ML) models requires flexible and efficient hardware solutions as hardwired accelerators face rapid obsolescence. |
| Flexible In-NAND Cryptographic Processing for Secure Flash Storage | Seock-Hwan Noh, Hoyeon Lee, Junkyum Kim, Junsu Im, Jay H. Park, Sungjin Lee, Sam H. Noh, Yeseong Kim, Jaeha Kung | 2025-08-05 | We present FlashVault, an in-NAND self-encryption architecture that embeds a reconfigurable cryptographic engine into the unused silicon area of a state-of-the-art 4D V-NAND structure. |
| Rhea: a Framework for Fast Design and Validation of RTL Cache-Coherent Memory Subsystems | Davide Zoni, Andrea Galimberti, Adriano Guarisco | 2025-08-05 | Designing and validating efficient cache-coherent memory subsystems is a critical yet complex task in the development of modern multi-core system-on-chip architectures. |
| Towards Memory Specialization: A Case for Long-Term and Short-Term RAM | Peijing Li, Muhammad Shahir Abdurraman, Rachel Cleaveland, Sergey Legtchenko, Philip Levis, Ioan Stefanovici, Thierry Tambe, David Tennenhouse, Caroline Trippel | 2025-08-05 | Both SRAM and DRAM have stopped scaling: there is no technical roadmap to reduce their cost (per byte/GB). As a result, memory now dominates system cost. |
| Mamba-X: An End-to-End Vision Mamba Accelerator for Edge Computing Devices | Dongho Yoon, Gungyu Lee, Jaewon Chang, Yunjae Lee, Dongjae Lee, Minsoo Rhu | 2025-08-05 | Transformers have proven effective in language modeling but are limited by high computational and memory demands that grow quadratically with input sequence length. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| Intelligent Sampling of Extreme-Scale Turbulence Datasets for Accurate and Efficient Spatiotemporal Model Training | Wesley Brewer, Murali Meena Gopalakrishnan, Matthias Maiterth, Aditya Kashi, Jong Youl Choi, Pei Zhang, Stephen Nichols, Riccardo Balin, Miles Couchman, Stephen de Bruyn Kops, P. K. Yeung, Daniel Dotson, Rohini Uma-Vaideswaran, Sarp Oral, Feiyi Wang | 2025-08-05 | With the end of Moore's law and Dennard scaling, efficient training increasingly requires rethinking data volume. Can we train better models with significantly less data via intelligent subsampling? T... |
| Two-dimensional Sparse Parallelism for Large Scale Deep Learning Recommendation Model Training | Xin Zhang, Quanyu Zhu, Liangbei Xu, Zain Huda, Wang Zhou, Jin Fang, Dennis van der Staay, Yuxi Hu, Jade Nie, Jiyan Yang, Chunzhi Yang | 2025-08-05 | The increasing complexity of deep learning recommendation models (DLRM) has led to a growing need for large-scale distributed systems that can efficiently train vast amounts of data. |
| Block: Balancing Load in LLM Serving with Context, Knowledge and Predictive Scheduling | Wei Da, Evangelia Kalyvianaki | 2025-08-05 | This paper presents Block, a distributed scheduling framework designed to optimize load balancing and auto-provisioning across instances in large language model serving frameworks by leveraging contex... |
| In-Memory Non-Binary LDPC Decoding | Oscar Ferraz, Vitor Silva, Gabriel Falcao | 2025-08-05 | Low-density parity-check (LDPC) codes are an important feature of several communication and storage applications, offering a flexible and effective method for error correction. |
| Understanding the Landscape of Ampere GPU Memory Errors | Zhu Zhu, Yu Sun, Dhatri Parakal, Bo Fang, Steven Farrell, Gregory H. Bauer, Brett Bode, Ian T. Foster, Michael E. Papka, William Gropp, Zhao Zhang, Lishan Yang | 2025-08-05 | Graphics Processing Units (GPUs) have become a de facto solution for accelerating high-performance computing (HPC) applications. Understanding their memory error behavior is an essential step toward a... |
| Optimal Simultaneous Byzantine Agreement, Common Knowledge and Limited Information Exchange | Ron van der Meyden | 2025-08-05 | In order to develop solutions that perform actions as early as possible, analysis of distributed algorithms using epistemic logic has generally concentrated on "full information protocols", which ma... |
| Directives for Function Offloading in 5G Networks Based on a Performance Characteristics Analysis | Falk Dettinger, Matthias Weiß, Daniel Baumann, Martin Sommer, Michael Weyrich | 2025-08-05 | Cloud-based offloading helps address energy consumption and performance challenges in executing resource-intensive vehicle algorithms. Utilizing 5G, with its low latency and high bandwidth, enables se... |
| Frontier: Simulating the Next Generation of LLM Inference Systems | Yicheng Feng, Xin Tan, Kin Hang Sew, Yimin Jiang, Yibo Zhu, Hong Xu | 2025-08-05 | Large Language Model (LLM) inference is growing increasingly complex with the rise of Mixture-of-Experts (MoE) models and disaggregated architectures that decouple components like prefill/decode (PD) ... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| Confidence Driven Classification of Application Types in the Presence of Background Network Traffic | Eun Hun Choi, Jasleen Kaur, Vladas Pipiras, Nelson Gomes Rodrigues Antunes, Brendan Massey | 2025-08-05 | Accurately classifying the application types of network traffic using deep learning models has recently gained popularity. However, we find that these classifiers do not perform well on real-world tra... |
| Data-Driven Spectrum Demand Prediction: A Spatio-Temporal Framework with Transfer Learning | Amin Farajzadeh, Hongzhao Zheng, Sarah Dumoulin, Trevor Ha, Halim Yanikomeroglu, Amir Ghasemi | 2025-08-05 | Accurate spectrum demand prediction is crucial for informed spectrum allocation, effective regulatory planning, and fostering sustainable growth in modern wireless communication networks. |
| CASH: Context-Aware Smart Handover for Reliable UAV Connectivity on Aerial Corridors | Abdul Saboor, Zhuangzhuang Cui, Achiel Colpaert, Evgenii Vinogradov, Sofie Pollin | 2025-08-05 | Urban Air Mobility (UAM) envisions aerial corridors for Unmanned Aerial Vehicles (UAVs) to reduce ground traffic congestion by supporting 3D mobility, such as air taxis. |
| What If, But Privately: Private Counterfactual Retrieval | Shreya Meel, Mohamed Nomeir, Pasan Dissanayake, Sanghamitra Dutta, Sennur Ulukus | 2025-08-05 | Transparency and explainability are two important aspects to be considered when employing black-box machine learning models in high-stake applications. |
| Decoding and Engineering the Phytobiome Communication for Smart Agriculture | Fatih Gulec, Hamdan Awan, Nigel Wallbridge, Andrew W. Eckford | 2025-08-05 | Smart agriculture applications, integrating technologies like the Internet of Things and machine learning/artificial intelligence (ML/AI) into agriculture, hold promise to address modern challenges of... |
| Heterogeneity-Oblivious Robust Federated Learning | Weiyao Zhang, Jinyang Li, Qi Song, Miao Wang, Chungang Lin, Haitong Luo, Xuying Meng, Yujun Zhang | 2025-08-05 | Federated Learning (FL) remains highly vulnerable to poisoning attacks, especially under real-world hyper-heterogeneity, where clients differ significantly in data distributions, communication capabil... |
| Agoran: An Agentic Open Marketplace for 6G RAN Automation | Ilias Chatzistefanidis, Navid Nikaein, Andrea Leone, Ali Maatouk, Leandros Tassiulas, Roberto Morabito, Ioannis Pitsiorlas, Marios Kountouris | 2025-08-05 | Next-generation mobile networks must reconcile the often-conflicting goals of multiple service owners. However, today's network slice controllers remain rigid, policy-bound, and unaware of the busines... |
| Bidirectional TLS Handshake Caching for Constrained Industrial IoT Scenarios | Jörn Bodenhausen, Simon Mangel, Thomas Vogt, Martin Henze | 2025-08-05 | While TLS has become the de-facto standard for end-to-end security, its use to secure critical communication in evolving industrial IoT scenarios is severely limited by prevalent resource constraints ... |
| Directives for Function Offloading in 5G Networks Based on a Performance Characteristics Analysis | Falk Dettinger, Matthias Weiß, Daniel Baumann, Martin Sommer, Michael Weyrich | 2025-08-05 | Cloud-based offloading helps address energy consumption and performance challenges in executing resource-intensive vehicle algorithms. Utilizing 5G, with its low latency and high bandwidth, enables se... |
| Energy-efficient Federated Learning for UAV Communications | Chien-Wei Fu, Meng-Lin Ku | 2025-08-05 | In this paper, we propose an unmanned aerial vehicle (UAV)-assisted federated learning (FL) framework that jointly optimizes UAV trajectory, user participation, power allocation, and data volume contr... |
| Scalability and Performance Evaluation of IEEE 802.11ah IoT Deployments: A Testbed Approach | Kostas Chounos, Katerina Kyriakou, Thanasis Korakis | 2025-08-05 | This work focuses on the development and assessment of modern wireless Internet of Things (IoT) architectures, with relevance to emerging 5G and beyond applications. |
| NANDA Adaptive Resolver: Architecture for Dynamic Resolution of AI Agent Names | John Zinky, Hema Seshadri, Mahesh Lambe, Pradyumna Chari, Ramesh Raskar | 2025-08-05 | AdaptiveResolver is a dynamic microservice architecture designed to address the limitations of static endpoint resolution for AI agent communication in distributed, heterogeneous environments. |
| Using the NANDA Index Architecture in Practice: An Enterprise Perspective | Sichao Wang, Ramesh Raskar, Mahesh Lambe, Pradyumna Chari, Rekha Singhal, Shailja Gupta, Rajesh Ranjan, Ken Huang | 2025-08-05 | The proliferation of autonomous AI agents represents a paradigmatic shift from traditional web architectures toward collaborative intelligent systems requiring sophisticated mechanisms for discovery, ... |
| Evolution of AI Agent Registry Solutions: Centralized, Enterprise, and Distributed Approaches | Aditi Singh, Abul Ehtesham, Mahesh Lambe, Jared James Grogan, Abhishek Singh, Saket Kumar, Luca Muscariello, Vijoy Pandey, Guillaume Sauvage De Saint Marc, Pradyumna Chari, Ramesh Raskar | 2025-08-05 | Autonomous AI agents now operate across cloud, enterprise, and decentralized domains, creating demand for registry infrastructures that enable trustworthy discovery, capability negotiation, and identi... |

cs.OS - Operating Systems

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| RX-INT: A Kernel Engine for Real-Time Detection and Analysis of In-Memory Threats | Arjun Juneja | 2025-08-05 | Malware and cheat developers use fileless execution techniques to evade traditional, signature-based security products. These methods include various types of manual mapping, module stomping, and thre... |
| MaLV-OS: Rethinking the Operating System Architecture for Machine Learning in Virtualized Clouds | Stella Bitchebe, Oana Balmau | 2025-08-05 | A large body of research has employed Machine Learning (ML) models to develop learned operating systems (OSes) and kernels. The latter dynamically adapts to the job load and dynamically adjusts resour... |

cs.PF - Performance

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| A Novel Hybrid Optical and STAR IRS System for NTN Communications | Shunyuan Shang, Emna Zedini, Abla Kammoun, Mohamed-Slim Alouini | 2025-08-05 | This paper proposes a novel non-terrestrial networks (NTNs) system that integrates optical intelligent reflecting surfaces (OIRS) and simultaneous transmitting and reflecting Intelligent reflecting su... |
