2025-05-09
cs.AR - Hardware Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| "vcd2df" -- Leveraging Data Science Insights for Hardware Security Research | Calvin Deutschbein, Jimmy Ostler, Hriday Raj | 2025-05-09 | Download | In this work, we hope to expand the universe of security practitioners of open-source hardware by creating a bridge from hardware design languages (HDLs) to data science languages like Python and R th... |
| A Comprehensive Data Description for LoRaWAN Path Loss Measurements in an Indoor Office Setting: Effects of Environmental Factors | Nahshon Mokua Obiri, Kristof Van Laerhoven | 2025-05-09 | Download | This paper presents a comprehensive dataset of LoRaWAN technology path loss measurements collected in an indoor office environment, focusing on quantifying the effects of environmental factors on sign... |
| Assessing Tenstorrent's RISC-V MatMul Acceleration Capabilities | Hiari Pizzini Cavagna, Daniele Cesarini, Andrea Bartolini | 2025-05-09 | Download | The increasing demand for generative AI as Large Language Models (LLMs) services has driven the need for specialized hardware architectures that optimize computational efficiency and energy consumptio... |
| Taming Offload Overheads in a Massively Parallel Open-Source RISC-V MPSoC: Analysis and Optimization | Luca Colagrande, Luca Benini | 2025-05-09 | Download | Heterogeneous multi-core architectures combine on a single chip a few large, general-purpose host cores, optimized for single-thread performance, with (many) clusters of small, specialized, energy-eff... |
| LightNobel: Improving Sequence Length Limitation in Protein Structure Prediction Model via Adaptive Activation Quantization | Seunghee Han, Soongyu Choi, Joo-Young Kim | 2025-05-09 | Download | Recent advances in Protein Structure Prediction Models (PPMs), such as AlphaFold2 and ESMFold, have revolutionized computational biology by achieving unprecedented accuracy in predicting three-dimensi... |
| What Is Next for LLMs? Next-Generation AI Computing Hardware Using Photonic Chips | Renjie Li, Wenjie Wei, Qi Xin, Xiaoli Liu, Sixuan Mao, Erik Ma, Zijian Chen, Malu Zhang, Haizhou Li, Zhaoyu Zhang | 2025-05-09 | Download | Large language models (LLMs) are rapidly pushing the limits of contemporary computing hardware. For example, training GPT-3 has been estimated to consume around 1300 MWh of electricity, and projection... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Challenging GPU Dominance: When CPUs Outperform for On-Device LLM Inference | Haolin Zhang, Jeff Huang | 2025-05-09 | Download | The common assumption in on-device AI is that GPUs, with their superior parallel processing, always provide the best performance for large language model (LLM) inference. |
| On Optimal Batch Size in Coded Computing | Swapnil Saha, Emina Soljanin, Philip Whiting | 2025-05-09 | Download | We consider computing systems that partition jobs into tasks, add redundancy through coding, and assign the encoded tasks to different computing nodes for parallel execution. |
| Distributed Tensor Network Library for Quantum Computing Emulation | Jakub Adamski, Oliver Thomson Brown | 2025-05-09 | Download | Tensor networks offer an adaptable and efficient approach to emulation of quantum computers. Their usage relies on partitioning circuits into small tensors, which are contracted together to form the f... |
| HashKitty: Distributed Password Analysis | Pedro Antunes, Tomás Santos, Daniel Fuentes, Luís Frazão | 2025-05-09 | Download | This article documents the HashKitty platform, a distributed solution for password analysis based on the hashcat tool, designed to improve efficiency in both offensive and defensive security operation... |
| Scheduled Jacobian Chaining | Simon Märtens, Uwe Naumann | 2025-05-09 | Download | This paper addresses the efficient computation of Jacobian matrices for programs composed of sequential differentiable subprograms. By representing the overall Jacobian as a chain product of the Jacob... |
| Efficient Information Updates in Compute-First Networking via Reinforcement Learning with Joint AoI and VoI | Jianpeng Qi, Chao Liu, Chengxiang Xu, Rui Wang, Junyu Dong, Yanwei Yu | 2025-05-09 | Download | Timely and efficient dissemination of service information is critical in compute-first networking systems, where user requests arrive dynamically and computing resources are constrained. |
| Toward Heterogeneous, Distributed, and Energy-Efficient Computing with SYCL | Biagio Cosenza, Lorenzo Carpentieri, Kaijie Fan, Marco D'Antonio, Peter Thoman, Philip Salzmann | 2025-05-09 | Download | Programming modern high-performance computing systems is challenging due to the need to efficiently program GPUs and accelerators and to handle data movement between nodes. |
| An Autonomy Loop for Dynamic HPC Job Time Limit Adjustment | Thomas Jakobsche, Osman Seckin Simsek, Jim Brandt, Ann Gentile, Florina M. Ciorba | 2025-05-09 | Download | High Performance Computing (HPC) systems rely on fixed user-provided estimates of job time limits. These estimates are often inaccurate, resulting in inefficient resource use and the loss of unsaved w... |
| Taming Offload Overheads in a Massively Parallel Open-Source RISC-V MPSoC: Analysis and Optimization | Luca Colagrande, Luca Benini | 2025-05-09 | Download | Heterogeneous multi-core architectures combine on a single chip a few large, general-purpose host cores, optimized for single-thread performance, with (many) clusters of small, specialized, energy-eff... |
| DawnPiper: A Memory-scablable Pipeline Parallel Training Framework | Xuan Peng, Xuanhua Shi, Haolin Zhang, Yunfei Zhao, Xuehai Qian | 2025-05-09 | Download | Pipeline parallelism is a crucial paradigm for large-scale model training. However, imbalances in memory footprint across stages can lead to significant GPU memory wastage, limiting the model sizes th... |
| All-to-All Communication with Mobile Edge Adversary: Almost Linearly More Faults, For Free | Orr Fischer, Merav Parter | 2025-05-09 | Download | Resilient computation in all-to-all-communication models has attracted tremendous attention over the years. Most of these works assume the classical faulty model which restricts the total number of co... |
| Understanding Stragglers in Large Model Training Using What-if Analysis | Jinkun Lin, Ziheng Jiang, Zuquan Song, Sida Zhao, Menghan Yu, Zhanghan Wang, Chenyuan Wang, Zuocheng Shi, Xiang Shi, Wei Jia, Zherui Liu, Shuguang Wang, Haibin Lin, Xin Liu, Aurojit Panda, Jinyang Li | 2025-05-09 | Download | Large language model (LLM) training is one of the most demanding distributed computations today, often requiring thousands of GPUs with frequent synchronization across machines. |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Revenue Optimization in Video Caching Networks with Privacy-Preserving Demand Predictions | Yijing Zhang, Ferdous Pervej, Andreas F. Molisch | 2025-05-09 | Download | Performance of video streaming, which accounts for most of the traffic in wireless communication, can be significantly improved by caching popular videos at the wireless edge. |
| A Comprehensive Data Description for LoRaWAN Path Loss Measurements in an Indoor Office Setting: Effects of Environmental Factors | Nahshon Mokua Obiri, Kristof Van Laerhoven | 2025-05-09 | Download | This paper presents a comprehensive dataset of LoRaWAN technology path loss measurements collected in an indoor office environment, focusing on quantifying the effects of environmental factors on sign... |
| Extending the Control Plane of Container Orchestrators for I/O Virtualization | Garegin Grigoryan, Minseok Kwon, M. Mustafa Rafique | 2025-05-09 | Download | Single Root Input/Output Virtualization (SR-IOV) is a standard technology for forking a single PCI express device and providing it to applications while ensuring performance isolation. |
| Efficient Information Updates in Compute-First Networking via Reinforcement Learning with Joint AoI and VoI | Jianpeng Qi, Chao Liu, Chengxiang Xu, Rui Wang, Junyu Dong, Yanwei Yu | 2025-05-09 | Download | Timely and efficient dissemination of service information is critical in compute-first networking systems, where user requests arrive dynamically and computing resources are constrained. |
| P4Kube: In-Network Load Balancer for Kubernetes | Garegin Grigoryan, Kevin Penkowski, Minseok Kwon | 2025-05-09 | Download | Kubernetes Services such as LoadBalancer and NodePort expose applications running on pods within a Kubernetes cluster to external users. While the LoadBalancer Service requires an external load-balanc... |
| Learning Power Control Protocol for In-Factory 6G Subnetworks | Uyoata E. Uyoata, Gilberto Berardinelli, Ramoni Adeogun | 2025-05-09 | Download | In-X Subnetworks are envisioned to meet the stringent demands of short-range communication in diverse 6G use cases. In the context of In-Factory scenarios, effective power control is critical to mitig... |
| Multi-User Beamforming with Deep Reinforcement Learning in Sensing-Aided Communication | Xiyu Wang, Gilberto Berardinelli, Hei Victor Cheng, Petar Popovski, Ramoni Adeogun | 2025-05-09 | Download | Mobile users are prone to experience beam failure due to beam drifting in millimeter wave (mmWave) communications. Sensing can help alleviate beam drifting with timely beam changes and low overhead si... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| On Optimal Batch Size in Coded Computing | Swapnil Saha, Emina Soljanin, Philip Whiting | 2025-05-09 | Download | We consider computing systems that partition jobs into tasks, add redundancy through coding, and assign the encoded tasks to different computing nodes for parallel execution. |
| Assessing Tenstorrent's RISC-V MatMul Acceleration Capabilities | Hiari Pizzini Cavagna, Daniele Cesarini, Andrea Bartolini | 2025-05-09 | Download | The increasing demand for generative AI as Large Language Models (LLMs) services has driven the need for specialized hardware architectures that optimize computational efficiency and energy consumptio... |