2025-12-10
cs.AR - Architecture
| Title | Authors | Published | Abstract |
|---|---|---|---|
| A Vertically Integrated Framework for Templatized Chip Design | Jeongeun Kim, Christopher Torng | 2025-12-10 | Developers who primarily engage with software often struggle to incorporate custom hardware into their applications, even though specialized silicon can provide substantial benefits to machine learnin... |
| Algorithm-Driven On-Chip Integration for High Density and Low Cost | Jeongeun Kim, Sabrina Yarzada, Paul Chen, Christopher Torng | 2025-12-10 | Growing interest in semiconductor workforce development has generated demand for platforms capable of supporting large numbers of independent hardware designs for research and training without imposin... |
| Pinball: A Cryogenic Predecoder for Surface Code Decoding Under Circuit-Level Noise | Alexander Knapen, Guanchen Tao, Jacob Mack, Tomas Bruno, Mehdi Saligane, Dennis Sylvester, Qirui Zhang, Gokul Subramanian Ravi | 2025-12-10 | Scaling fault tolerant quantum computers, especially cryogenic systems based on the surface code, to millions of qubits is challenging due to poorly-scaling data processing and power consumption overh... |
| ODMA: On-Demand Memory Allocation Strategy for LLM Serving on LPDDR-Class Accelerators | Guoqiang Zou, Wanyu Wang, Hao Zheng, Longxiang Yin, Yinhe Han | 2025-12-10 | Existing memory management techniques severely hinder efficient Large Language Model serving on accelerators constrained by poor random-access bandwidth. |
| RACAM: Enhancing DRAM with Reuse-Aware Computation and Automated Mapping for ML Inference | Siyuan Ma, Jiajun Hu, Jeeho Ryoo, Aman Arora, Lizy Kurian John | 2025-12-10 | In-DRAM Processing-In-Memory (DRAM-PIM) has emerged as a promising approach to accelerate memory-intensive workloads by mitigating data transfer overhead between DRAM and the host processor. |
| Efficient MoE Serving in the Memory-Bound Regime: Balance Activated Experts, Not Tokens | Yanpeng Yu, Haiyue Ma, Krish Agarwal, Nicolai Oswald, Qijing Huang, Hugo Linsenmaier, Chunhui Mei, Ritchie Zhao, Ritika Borkar, Bita Darvish Rouhani, David Nellans, Ronny Krashinsky, Anurag Khandelwal | 2025-12-10 | Expert Parallelism (EP) permits Mixture of Experts (MoE) models to scale beyond a single GPU. To address load imbalance across GPUs in EP, existing approaches aim to balance the number of tokens each ... |
| Tensor-Compressed and Fully-Quantized Training of Neural PDE Solvers | Jinming Lu, Jiayi Tian, Yequan Zhao, Hai Li, Zheng Zhang | 2025-12-10 | Physics-Informed Neural Networks (PINNs) have emerged as a promising paradigm for solving partial differential equations (PDEs) by embedding physical laws into neural network training objectives. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Abstract |
|---|---|---|---|
| A Comparative Analysis of zk-SNARKs and zk-STARKs: Theory and Practice | Ayush Nainwal, Atharva Kamble, Nitin Awathare | 2025-12-10 | Zero-knowledge proofs (ZKPs) are central to secure and privacy-preserving computation, with zk-SNARKs and zk-STARKs emerging as leading frameworks offering distinct trade-offs in efficiency, scalabili... |
| Link-Sharing Backpressure Routing In Wireless Multi-Hop Networks | Zhongyuan Zhao, Yujun Ming, Ananthram Swami, Kevin Chan, Fikadu Dagefu, Santiago Segarra | 2025-12-10 | Backpressure (BP) routing and scheduling is an established resource allocation method for wireless multi-hop networks, noted for its fully distributed operation and maximum queue stability. |
| Ariel-ML: Computing Parallelization with Embedded Rust for Neural Networks on Heterogeneous Multi-core Microcontrollers | Zhaolan Huang, Kaspar Schleiser, Gyungmin Myung, Emmanuel Baccelli | 2025-12-10 | Low-power microcontroller (MCU) hardware is currently evolving from single-core architectures to predominantly multi-core architectures. In parallel, new embedded software building blocks are more and... |
| Recoverable Lock-Free Locks | Hagit Attiya, Panagiota Fatourou, Eleftherios Kosmas, Yuanhao Wei | 2025-12-10 | This paper presents the first transformation that introduces both lock-freedom and recoverability. Our transformation starts with a lock-based implementation, and provides a recoverable, lock-free sub... |
| Straggler Tolerant and Resilient DL Training on Homogeneous GPUs | Zeyu Zhang, Haiying Shen | 2025-12-10 | Despite the popularity of homogeneous GPU-based deep learning (DL) training, the prevalence, causes and impact of stragglers and the effectiveness of existing straggler mitigation approaches are still... |
| SynthPix: A lightspeed PIV images generator | Antonio Terpin, Alan Bonomi, Francesco Banelli, Raffaello D'Andrea | 2025-12-10 | We describe SynthPix, a synthetic image generator for Particle Image Velocimetry (PIV) with a focus on performance and parallelism on accelerators, implemented in JAX. |
| PHWSOA: A Pareto-based Hybrid Whale-Seagull Scheduling for Multi-Objective Tasks in Cloud Computing | Zhi Zhao, Hang Xiao, Wei Rang | 2025-12-10 | Task scheduling is a critical research challenge in cloud computing, a transformative technology widely adopted across industries. Although numerous scheduling solutions exist, they predominantly opti... |
| Scalable Construction of Spiking Neural Networks using up to thousands of GPUs | Bruno Golosio, Gianmarco Tiddia, José Villamar, Luca Pontisso, Luca Sergi, Francesco Simula, Pooja Babu, Elena Pastorelli, Abigail Morrison, Markus Diesmann, Alessandro Lonardo, Pier Stanislao Paolucci, Johanna Senk | 2025-12-10 | Diverse scientific and engineering research areas deal with discrete, time-stamped changes in large systems of interacting delay differential equations. |
| WarmServe: Enabling One-for-Many GPU Prewarming for Multi-LLM Serving | Chiheng Lou, Sheng Qi, Rui Kang, Yong Zhang, Chen Sun, Pengcheng Wang, Bingyang Liu, Xuanzhe Liu, Xin Jin | 2025-12-10 | Deploying multiple models within shared GPU clusters is promising for improving resource efficiency in large language model (LLM) serving. Existing multi-LLM serving systems optimize GPU utilization a... |
| PerCache: Predictive Hierarchical Cache for RAG Applications on Mobile Devices | Kaiwei Liu, Liekang Zeng, Lilin Xu, Bufang Yang, Zhenyu Yan | 2025-12-10 | Retrieval-augmented generation (RAG) has been extensively used as a de facto paradigm in various large language model (LLM)-driven applications on mobile devices, such as mobile assistants leveraging ... |
| Passing the Baton: High Throughput Distributed Disk-Based Vector Search with BatANN | Nam Anh Dang, Ben Landrum, Ken Birman | 2025-12-10 | Vector search underpins modern information-retrieval systems, including retrieval-augmented generation (RAG) pipelines and search engines over unstructured text and images. |
| A Distributed Framework for Privacy-Enhanced Vision Transformers on the Edge | Zihao Ding, Mufeng Zhu, Zhongze Tang, Sheng Wei, Yao Liu | 2025-12-10 | Nowadays, visual intelligence tools have become ubiquitous, offering all kinds of convenience and possibilities. However, these tools have high computational requirements that exceed the capabilities ... |
| SHARe-KAN: Holographic Vector Quantization for Memory-Bound Inference | Jeff Smith | 2025-12-10 | Kolmogorov-Arnold Networks (KANs) face a fundamental memory wall: their learned basis functions create parameter counts that impose extreme bandwidth demands, hindering deployment in memory-constraine... |
| GoodSpeed: Optimizing Fair Goodput with Adaptive Speculative Decoding in Distributed Edge Inference | Phuong Tran, Tzu-Hao Liu, Long Tan Le, Tung-Anh Nguyen, Van Quan La, Eason Yu, Han Shu, Choong Seon Hong, Nguyen H. Tran | 2025-12-10 | Large language models (LLMs) have revolutionized natural language processing, yet their high computational demands pose significant challenges for real-time inference, especially in multi-user server ... |
| Efficient MoE Serving in the Memory-Bound Regime: Balance Activated Experts, Not Tokens | Yanpeng Yu, Haiyue Ma, Krish Agarwal, Nicolai Oswald, Qijing Huang, Hugo Linsenmaier, Chunhui Mei, Ritchie Zhao, Ritika Borkar, Bita Darvish Rouhani, David Nellans, Ronny Krashinsky, Anurag Khandelwal | 2025-12-10 | Expert Parallelism (EP) permits Mixture of Experts (MoE) models to scale beyond a single GPU. To address load imbalance across GPUs in EP, existing approaches aim to balance the number of tokens each ... |
| TDC-Cache: A Trustworthy Decentralized Cooperative Caching Framework for Web3.0 | Jinyu Chen, Long Shi, Taotao Wang, Jiaheng Wang, Wei Zhang | 2025-12-10 | The rapid growth of Web3.0 is transforming the Internet from a centralized structure to decentralized, which empowers users with unprecedented self-sovereignty over their own data. |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Abstract |
|---|---|---|---|
| Lightweight Security for Private Networks: Real-World Evaluation of WireGuard | Hubert Djuitcheu, Andrew Sergeev, Khurshid Alam, Danny Santhosh, Achim Autenrieth, Jochen Seitz | 2025-12-10 | This paper explores WireGuard as a lightweight alternative to IPsec for securing the user plane as well as the control plane in an industrial Open RAN deployment at the Adtran Terafactory in Meiningen... |
| Network Traffic Analysis with Process Mining: The UPSIDE Case Study | Francesco Vitale, Paolo Palmiero, Massimiliano Rak, Nicola Mazzocca | 2025-12-10 | Online gaming is a popular activity involving the adoption of complex systems and network infrastructures. The relevance of gaming, which generates large amounts of market revenue, drove research in m... |
| Link-Sharing Backpressure Routing In Wireless Multi-Hop Networks | Zhongyuan Zhao, Yujun Ming, Ananthram Swami, Kevin Chan, Fikadu Dagefu, Santiago Segarra | 2025-12-10 | Backpressure (BP) routing and scheduling is an established resource allocation method for wireless multi-hop networks, noted for its fully distributed operation and maximum queue stability. |
| Towards Practical and Usable In-network Classification | Di Zhu, Jianxi Chen, Hyojoon Kim | 2025-12-10 | In-network machine learning enables real-time classification directly on network hardware, offering consistently low inference latency. However, current solutions are limited by strict hardware constr... |
| M3Net: A Multi-Metric Mixture of Experts Network Digital Twin with Graph Neural Networks | Blessed Guda, Carlee Joe-Wong | 2025-12-10 | The rise of 5G/6G network technologies promises to enable applications like autonomous vehicles and virtual reality, resulting in a significant increase in connected devices and necessarily complicati... |
| Graph-Based Bayesian Optimization for Quantum Circuit Architecture Search with Uncertainty Calibrated Surrogates | Prashant Kumar Choudhary, Nouhaila Innan, Muhammad Shafique, Rajeev Singh | 2025-12-10 | Quantum circuit design is a key bottleneck for practical quantum machine learning on complex, real-world data. We present an automated framework that discovers and refines variational quantum circuits... |
| BlockFLEX: An Adaptive and Survivable Architecture with Hierarchical Routing for LEO Satellite Networks | Xiangtong Wang | 2025-12-10 | This paper presents BlockFLEX, an adaptive and survivable architecture with a hierarchical routing scheme for Low Earth Orbit satellite networks, designed to address dynamic topology changes ... |
| Eunomia: A Multicontroller Domain Partitioning Framework in Hierarchical Satellite Network | Qi Zhang, Kun Qiu, Zhe Chen, Wenjun Zhu, Xiaofan Xu, Ping Du, Yue Gao | 2025-12-10 | With the rise of mega-satellite constellations, the integration of hierarchical non-terrestrial and terrestrial networks has become a cornerstone of 6G coverage enhancements. |
| Tyche: A Hybrid Computation Framework of Illumination Pattern for Satellite Beam Hopping | Ziheng Yang, Kun Qiu, Zhe Chen, Wenjun Zhu, Yue Gao | 2025-12-10 | High-Throughput Satellites (HTS) use beam hopping to handle non-uniform and time-varying ground traffic demand. A significant technical challenge in beam hopping is the computation of effective illumi... |
cs.OS - Operating Systems
| Title | Authors | Published | Abstract |
|---|---|---|---|
| ZeroOS: A Universal Modular Library OS for zkVMs | Guangxian Zou, Isaac Zhang, Ryan Zarick, Kelvin Wong, Thomas Kim, Daniel L.-K. Wong, Saeid Yazdinejad, Dan Boneh | 2025-12-10 | zkVMs promise general-purpose verifiable computation through ISA-level compatibility with modern programs and toolchains. However, compatibility extends further than just the ISA; modern programs ofte... |
cs.PF - Performance
| Title | Authors | Published | Abstract |
|---|---|---|---|
| Ariel-ML: Computing Parallelization with Embedded Rust for Neural Networks on Heterogeneous Multi-core Microcontrollers | Zhaolan Huang, Kaspar Schleiser, Gyungmin Myung, Emmanuel Baccelli | 2025-12-10 | Low-power microcontroller (MCU) hardware is currently evolving from single-core architectures to predominantly multi-core architectures. In parallel, new embedded software building blocks are more and... |
| TinyDéjàVu: Smaller Memory Footprint & Faster Inference on Sensor Data Streams with Always-On Microcontrollers | Zhaolan Huang, Emmanuel Baccelli | 2025-12-10 | Always-on sensors are increasingly expected to embark a variety of tiny neural networks and to continuously perform inference on time-series of the data they sense. |