2025-11-26
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Platinum: Path-Adaptable LUT-Based Accelerator Tailored for Low-Bit Weight Matrix Multiplication | Haoxuan Shan, Cong Guo, Chiyue Wei, Feng Cheng, Junyao Zhang, Hai "Helen" Li, Yiran Chen | 2025-11-26 | Download | The rapid scaling of large language models demands more efficient hardware. Quantization offers a promising trade-off between efficiency and performance. |
| Modeling and Simulation Frameworks for Processing-in-Memory Architectures | Mahdi Aghaei, Saba Ebrahimi, Mohammad Saleh Arafati, Elham Cheshmikhani, Dara Rahmati, Saeid Gorgin, Jungrae Kim | 2025-11-26 | Download | Processing-in-Memory (PIM) has emerged as a promising computing paradigm to address the memory wall and the fundamental bottleneck of the von Neumann architecture by reducing costly data movement betw... |
| Modeling and Optimizing Performance Bottlenecks for Neuromorphic Accelerators | Jason Yik, Walter Gallego Gomez, Andrew Cheng, Benedetto Leto, Alessandro Pierro, Noah Pacik-Nelson, Korneel Van den Berghe, Vittorio Fra, Andreea Danielescu, Gianvito Urgese, Vijay Janapa Reddi | 2025-11-26 | Download | Neuromorphic accelerators offer promising platforms for machine learning (ML) inference by leveraging event-driven, spatially-expanded architectures that naturally exploit unstructured sparsity throug... |
| A 0.32 mm² 100 Mb/s 223 mW ASIC in 22FDX for Joint Jammer Mitigation, Channel Estimation, and SIMO Data Detection | Jonas Elmiger, Fabian Stuber, Oscar Castañeda, Gian Marti, Christoph Studer | 2025-11-26 | Download | We present the first single-input multiple-output (SIMO) receiver ASIC that jointly performs jammer mitigation, channel estimation, and data detection. |
| A Jammer-Resilient 2.87 mm² 1.28 MS/s 310 mW Multi-Antenna Synchronization ASIC in 65 nm | Flurin Arquint, Oscar Castañeda, Gian Marti, Christoph Studer | 2025-11-26 | Download | We present the first ASIC implementation of jammer-resilient multi-antenna time synchronization. The ASIC implements a recent algorithm that mitigates jamming attacks on synchronization signals using ... |
| Bombyx: OpenCilk Compilation for FPGA Hardware Acceleration | Mohamed Shahawy, Julien de Castelnau, Paolo Ienne | 2025-11-26 | Download | Task-level parallelism (TLP) is a widely used approach in software where independent tasks are dynamically created and scheduled at runtime. Recent systems have explored architectural support for TLP ... |
| RISC-V Based TinyML Accelerator for Depthwise Separable Convolutions in Edge AI | Muhammed Yildirim, Ozcan Ozturk | 2025-11-26 | Download | The increasing demand for on-device intelligence in Edge AI and TinyML applications requires the efficient execution of modern Convolutional Neural Networks (CNNs). |
| LLaMCAT: Optimizing Large Language Model Inference with Cache Arbitration and Throttling | Zhongchun Zhou, Chengtao Lai, Wei Zhang | 2025-11-26 | Download | Large Language Models (LLMs) have achieved unprecedented success across various applications, but their substantial memory requirements pose significant challenges to current memory system designs, es... |
| Handling of Memory Page Faults during Virtual-Address RDMA | Antonis Psistakis | 2025-11-26 | Download | Nowadays, avoiding system calls during cluster communication (e.g., in Data Centers and High Performance Computing) in modern high-speed interconnection networks has become a necessity, due to the hig... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| ZipperChain: Transmuting Trusted Third-Party Services Into Trustless Atomic Broadcast | Matteo Bjornsson, Taylor Hardin, Taylor Heinecke, Marcin Furtak, David L. Millman, Mike P. Wittie | 2025-11-26 | Download | Distributed ledger technologies (DLTs) rely on distributed consensus mechanisms to reach agreement over the order of transactions and to provide immutability and availability of transaction data. |
| Clock2Q+: A Simple and Efficient Replacement Algorithm for Metadata Cache in VMware vSAN | Yiyan Zhai, Bintang Dwi Marthen, Sarath Balivada, Vamsi Sudhakar Bojji, Eric Knauft, Jitender Rohilla, Jiaqi Zuo, Quanxing Liu, Maxime Austruy, Wenguang Wang, Juncheng Yang | 2025-11-26 | Download | Cache replacement algorithms are critical building blocks of storage systems. This paper examines the characteristics of metadata caches and argues that they inherently exhibit correlated references, ... |
| OOCO: Latency-disaggregated Architecture for Online-Offline Co-locate LLM Serving | Siyu Wu, Zihan Tang, Yuting Zeng, Hui Chen, Guiguang Ding, Tongxuan Liu, Ke Zhang, Hailong Yang | 2025-11-26 | Download | Large Language Models (LLMs) are increasingly deployed in both latency-sensitive online services and cost-sensitive offline workloads. Co-locating these workloads on shared serving instances can impro... |
| Equivalence and Separation between Heard-Of and Asynchronous Message-Passing Models | Hagit Attiya, Armando Castañeda, Dhrubajyoti Ghosh, Thomas Nowak | 2025-11-26 | Download | We revisit the relationship between two fundamental models of distributed computation: the asynchronous message-passing model with up to $f$ crash failures ($\operatorname{AMP}_f$) and the Heard-Of mo... |
| A Sustainable and Reward Incentivized High-Performance Cluster Computing for Artificial Intelligence: A Novel Bayesian-Time-Decay Trust Mechanism in Blockchain | Murat Yaslioglu | 2025-11-26 | Download | In an age where sustainability is of paramount importance, the significance of both high-performance computing and intelligent algorithms cannot be understated. |
| DSD: A Distributed Speculative Decoding Solution for Edge-Cloud Agile Large Model Serving | Fengze Yu, Leshu Li, Brad McDanel, Sai Qian Zhang | 2025-11-26 | Download | Large language model (LLM) inference often suffers from high decoding latency and limited scalability across heterogeneous edge-cloud environments. |
| AI/ML Model Cards in Edge AI Cyberinfrastructure: towards Agentic AI | Beth Plale, Neelesh Karthikeyan, Isuru Gamage, Joe Stubbs, Sachith Withana | 2025-11-26 | Download | AI/ML model cards can contain a benchmarked evaluation of an AI/ML model against intended use but a one time assessment during model training does not get at how and where a model is actually used ove... |
| Diagonal Scaling: A Multi-Dimensional Resource Model and Optimization Framework for Distributed Databases | Shahir Abdullah, Syed Rohit Zaman | 2025-11-26 | Download | Modern cloud databases present scaling as a binary decision: scale-out by adding nodes or scale-up by increasing per-node resources. This one-dimensional view is limiting because database performance,... |
| MAD-DAG: Protecting Blockchain Consensus from MEV | Roi Bar-Zur, Aviv Tamar, Ittay Eyal | 2025-11-26 | Download | Blockchain security is threatened by selfish mining, where a miner (operator) deviates from the protocol to increase their revenue. Selfish mining is exacerbated by adverse conditions: rushing (networ... |
| Modeling the Effect of Data Redundancy on Speedup in MLFMA Near-Field Computation | Morteza Sadeghi | 2025-11-26 | Download | The near-field (P2P) operator in the Multilevel Fast Multipole Algorithm (MLFMA) is a performance bottleneck on GPUs due to poor memory locality. |
| MemFine: Memory-Aware Fine-Grained Scheduling for MoE Training | Lu Zhao, Rong Shi, Shaoqing Zhang, Yueqiang Chen, Baoguo He, Hongfeng Sun, Ziqing Yin, Shangchao Su, Zhiyan Cui, Liang Dong, Xiyuan Li, Lingbin Wang, Jianwei He, Jiesong Ma, Weikang Huang, Jianglei Tong, Dongdong Gao, Jian Zhang, Hong Tian, Hui Shen, Zongtai Luo, Zhaoqun Sun, Hongxing Niu, Yue Sun | 2025-11-26 | Download | The training of large-scale Mixture of Experts (MoE) models faces a critical memory bottleneck due to severe load imbalance caused by dynamic token routing. |
| Automated Dynamic AI Inference Scaling on HPC-Infrastructure: Integrating Kubernetes, Slurm and vLLM | Tim Trappen, Robert Keßler, Roland Pabel, Viktor Achter, Stefan Wesner | 2025-11-26 | Download | Due to rising demands for Artificial Intelligence (AI) inference, especially in higher education, novel solutions utilising existing infrastructure are emerging. |
| GPU-Virt-Bench: A Comprehensive Benchmarking Framework for Software-Based GPU Virtualization Systems | Jithin VG, Ditto PS | 2025-11-26 | Download | The proliferation of GPU-accelerated workloads, particularly in artificial intelligence and large language model (LLM) inference, has created unprecedented demand for efficient GPU resource sharing in... |
| Privacy in Federated Learning with Spiking Neural Networks | Dogukan Aksu, Jesus Martinez del Rincon, Ihsen Alouani | 2025-11-26 | Download | Spiking neural networks (SNNs) have emerged as prominent candidates for embedded and edge AI. Their inherent low power consumption makes them far more efficient than conventional ANNs in scenarios whe... |
| GPU Memory Prediction for Multimodal Model Training | Jinwoo Jeong, Minchul Kang, Younghun Go, Changyong Shin, Hyunho Lee, Junho Yoon, Gyeongsik Yang, Chuck Yoo | 2025-11-26 | Download | As deep learning models in agentic AI systems grow in scale and complexity, GPU memory requirements increase and often exceed the available GPU memory capacity, so that out-of-memory (OoM) errors occu... |
| LLaMCAT: Optimizing Large Language Model Inference with Cache Arbitration and Throttling | Zhongchun Zhou, Chengtao Lai, Wei Zhang | 2025-11-26 | Download | Large Language Models (LLMs) have achieved unprecedented success across various applications, but their substantial memory requirements pose significant challenges to current memory system designs, es... |
| Handling of Memory Page Faults during Virtual-Address RDMA | Antonis Psistakis | 2025-11-26 | Download | Nowadays, avoiding system calls during cluster communication (e.g., in Data Centers and High Performance Computing) in modern high-speed interconnection networks has become a necessity, due to the hig... |
| Efficient Multi-Adapter LLM Serving via Cross-Model KV-Cache Reuse with Activated LoRA | Allison Li, Kristjan Greenewald, Thomas Parnell, Navid Azizan | 2025-11-26 | Download | Modern large language model (LLM) systems increasingly rely on multi-turn pipelines that are composed of multiple task-specific adapters, yet existing serving frameworks remain inefficient, incurring ... |
| DOPD: A Dynamic PD-Disaggregation Architecture for Maximizing Goodput in LLM Inference Serving | Junhan Liao, Minxian Xu, Wanyi Zheng, Yan Wang, Kejiang Ye, Rajkumar Buyya, Chengzhong Xu | 2025-11-26 | Download | To meet strict Service-Level Objectives (SLOs), contemporary Large Language Models (LLMs) decouple the prefill and decoding stages and place them on separate GPUs to mitigate the distinct bottlenecks i... |
| Aragog: Just-in-Time Model Routing for Scalable Serving of Agentic Workflows | Yinwei Dai, Zhuofu Chen, Anand Iyer, Ravi Netravali | 2025-11-26 | Download | Agentic workflows have emerged as a powerful paradigm for solving complex, multi-stage tasks, but serving them at scale is computationally expensive given the many LLM inferences that each request mus... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| ZipperChain: Transmuting Trusted Third-Party Services Into Trustless Atomic Broadcast | Matteo Bjornsson, Taylor Hardin, Taylor Heinecke, Marcin Furtak, David L. Millman, Mike P. Wittie | 2025-11-26 | Download | Distributed ledger technologies (DLTs) rely on distributed consensus mechanisms to reach agreement over the order of transactions and to provide immutability and availability of transaction data. |
| Resilient and Reliable Cloud Network Control for Mission-Critical Latency-Sensitive Service Chains | Chin-Wei Huang, Jaime Llorca, Antonia M. Tulino, Andreas F. Molisch | 2025-11-26 | Download | The proliferation of mission-critical latency-sensitive services has intensified the demand for next-generation cloud-integrated networks to guarantee both reliable and resilient service delivery. |
| Secure Command, Control and Communications Systems (C3) for Army UxVs | T. Rebolo, A. Grilo, C. Ribeiro | 2025-11-26 | Download | Unmanned Vehicles (UxVs) are increasingly used in modern military operations for reconnaissance, surveillance, and strike missions, enhancing situational awareness while reducing risk to personnel. |
| Toward Secure Content-Centric Approaches for 5G-Based IoT: Advances and Emerging Trends | Ghada Jaber, Mohamed Ali Zormati, Walid Cavelius, Louka Chapiro, Mohamed El Ahmadi | 2025-11-26 | Download | The convergence of the Internet of Things (IoT) and 5G technologies is transforming modern communication systems by enabling massive connectivity, low latency, and high-speed data transmission. |
| ChronoRAN: Analyzing Latency in 5G Systems | Arman Maghsoudnia, Aoyu Gong, Raphael Cannatà, Dan Mihai Dumitriu, Haitham Hassanieh | 2025-11-26 | Download | This paper presents ChronoRAN, a mathematical framework for accurately computing one-way latency (for uplink and downlink) in the 5G RAN across diverse system configurations. |
| Digital Twin-Driven Secure Access Strategy for SAGIN-Enabled IoT Networks | Hui Liang, Zhihui Wu, Runqi Yuan, Guobin Zhang, Yanfeng Zhang, Jinkai Zheng, Tom H. Luan | 2025-11-26 | Download | In space-air-ground integrated networks (SAGIN)-enabled IoT networks, secure access has become a significant challenge due to the increasing risks of eavesdropping attacks. |
| 5G Network Automation Using Local Large Language Models and Retrieval-Augmented Generation | Ahmadreza Majlesara, Ali Majlesi, Ali Mamaghani, Alireza Shokrani, Babak Hossein Khalaj | 2025-11-26 | Download | This demonstration showcases the integration of a lightweight, locally deployed Large Language Model (LLaMA-3 8b Q-4b) empowered by retrieval augmented generation (RAG) to automate 5G network manageme... |
| Performance Evaluation of Low-Latency Live Streaming of MPEG-DASH UHD video over Commercial 5G NSA/SA Network | Kasidis Arunruangsirilert, Bo Wei, Hang Song, Jiro Katto | 2025-11-26 | Download | 5G Standalone (SA) is the goal of the 5G evolution, which aims to provide higher throughput and lower latency than the existing LTE network. One of the main applications of 5G is the real-time distrib... |
| Real-World Performance Evaluations of Low-Band 5G NR/4G LTE 4x4 MIMO on Commercial Smartphones | Pasapong Wongprasert, Kasidis Arunruangsirilert, Jiro Katto | 2025-11-26 | Download | All 3GPP-compliant commercial 5G New Radio (NR)-capable UEs on the market are equipped with 4x4 MIMO support for Mid-Band frequencies (>1.7 GHz) and above, enabling up to rank 4 MIMO transmission. |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| DynamicAdaptiveClimb: Adaptive Cache Replacement with Dynamic Resizing | Daniel Berend, Shlomi Dolev, Sweta Kumari, Dhruv Mishra, Marina Kogan-Sadetsky, Archit Somani | 2025-11-26 | Download | Efficient cache management is critical for optimizing the system performance, and numerous caching mechanisms have been proposed, each exploring various insertion and eviction strategies. |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Modeling the Effect of Data Redundancy on Speedup in MLFMA Near-Field Computation | Morteza Sadeghi | 2025-11-26 | Download | The near-field (P2P) operator in the Multilevel Fast Multipole Algorithm (MLFMA) is a performance bottleneck on GPUs due to poor memory locality. |
| Automated Dynamic AI Inference Scaling on HPC-Infrastructure: Integrating Kubernetes, Slurm and vLLM | Tim Trappen, Robert Keßler, Roland Pabel, Viktor Achter, Stefan Wesner | 2025-11-26 | Download | Due to rising demands for Artificial Intelligence (AI) inference, especially in higher education, novel solutions utilising existing infrastructure are emerging. |