2025-09-24
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Experience Deploying Containerized GenAI Services at an HPC Center | Angel M. Beltre, Jeff Ogden, Kevin Pedretti | 2025-09-24 | Download | Generative Artificial Intelligence (GenAI) applications are built from specialized components -- inference servers, object storage, vector and graph databases, and user interfaces -- interconnected vi... |
| ZynqParrot: A Scale-Down Approach to Cycle-Accurate, FPGA-Accelerated Co-Emulation | Daniel Ruelas-Petrisko, Farzam Gilani, Anoop Mysore Nataraja, Zoe Taylor, Michael Taylor | 2025-09-24 | Download | As processors increase in complexity, costs grow even more rapidly, both for functional verification and performance validation. Most often, silicon characterizations comprise simple performance count... |
| SoCks - Simplifying Firmware and Software Integration for Heterogeneous SoCs | Marvin Fuchs, Lukas Scheller, Timo Muscheid, Oliver Sander, Luis E. Ardila-Perez | 2025-09-24 | Download | Modern heterogeneous System-on-Chip (SoC) devices integrate advanced components into a single package, offering powerful capabilities while also introducing significant complexity. |
| Pedagogically Motivated and Composable Open-Source RISC-V Processors for Computer Science Education | Ian McDougall, Harish Batchu, Michael Davies, Karthikeyan Sankaralingam | 2025-09-24 | Download | While most instruction set architectures (ISAs) are only available to use through the purchase of a restrictive commercial license, the RISC-V ISA presents a free and open-source alternative. |
| Design Insights and Comparative Evaluation of a Hardware-Based Cooperative Perception Architecture for Lane Change Prediction | Mohamed Manzour, Catherine M. Elias, Omar M. Shehata, Rubén Izquierdo, Miguel Ángel Sotelo | 2025-09-24 | Download | Research on lane change prediction has gained attention in the last few years. Most existing works in this area have been conducted in simulation environments or with pre-recorded datasets, these work... |
| The Cream Rises to the Top: Efficient Reranking Method for Verilog Code Generation | Guang Yang, Wei Zheng, Xiang Chen, Yifan Sun, Fengji Zhang, Terry Yue Zhuo | 2025-09-24 | Download | LLMs face significant challenges in Verilog generation due to limited domain-specific knowledge. While sampling techniques improve pass@k metrics, hardware engineers need one trustworthy solution rath... |
| Automated Multi-Agent Workflows for RTL Design | Amulya Bhattaram, Janani Ramamoorthy, Ranit Gupta, Diana Marculescu, Dimitrios Stamoulis | 2025-09-24 | Download | The rise of agentic AI workflows unlocks novel opportunities for computer systems design and optimization. However, for specialized domains such as program synthesis, the relative scarcity of HDL and ... |
| Digital Signal Processing from Classical Coherent Systems to Continuous-Variable QKD: A Review of Cross-Domain Techniques, Applications, and Challenges | Davi Juvêncio Gomes de Sousa, Caroline da Silva Morais Alves, Valéria Loureiro da Silva, Nelson Alves Ferreira Neto | 2025-09-24 | Download | This systematic review investigates the application of digital signal processing (DSP) techniques -- originally developed for coherent optical communication systems to continuous-variable quantum key ... |
| OpenGL GPU-Based Rowhammer Attack (Work in Progress) | Antoine Plin, Frédéric Fauberteau, Nga Nguyen | 2025-09-24 | Download | Rowhammer attacks have emerged as a significant threat to modern DRAM-based memory systems, leveraging frequent memory accesses to induce bit flips in adjacent memory cells. |
| SpecMamba: Accelerating Mamba Inference on FPGA with Speculative Decoding | Linfeng Zhong, Songqiang Xu, Huifeng Wen, Tong Xie, Qingyu Guo, Yuan Wang, Meng Li | 2025-09-24 | Download | The growing demand for efficient long-sequence modeling on edge devices has propelled widespread adoption of State Space Models (SSMs) like Mamba, due to their superior computational efficiency and sc... |
| Open-source Stand-Alone Versatile Tensor Accelerator | Anthony Faure-Gignoux, Kevin Delmas, Adrien Gauffriau, Claire Pagetti | 2025-09-24 | Download | Machine Learning (ML) applications demand significant computational resources, posing challenges for safety-critical domains like aeronautics. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Experience Deploying Containerized GenAI Services at an HPC Center | Angel M. Beltre, Jeff Ogden, Kevin Pedretti | 2025-09-24 | Download | Generative Artificial Intelligence (GenAI) applications are built from specialized components -- inference servers, object storage, vector and graph databases, and user interfaces -- interconnected vi... |
| FZModules: A Heterogeneous Computing Framework for Customizable Scientific Data Compression Pipelines | Skyler Ruiter, Jiannan Tian, Fengguang Song | 2025-09-24 | Download | Modern scientific simulations and instruments generate data volumes that overwhelm memory and storage, throttling scalability. Lossy compression mitigates this by trading controlled error for reduced ... |
| Adaptive Approach to Enhance Machine Learning Scheduling Algorithms During Runtime Using Reinforcement Learning in Metascheduling Applications | Samer Alshaer, Ala Khalifeh, Roman Obermaisser | 2025-09-24 | Download | Metascheduling in time-triggered architectures has been crucial in adapting to dynamic and unpredictable environments, ensuring the reliability and efficiency of task execution. |
| Reconstruction-Based Adaptive Scheduling Using AI Inferences in Safety-Critical Systems | Samer Alshaer, Ala Khalifeh, Roman Obermaisser | 2025-09-24 | Download | Adaptive scheduling is crucial for ensuring the reliability and safety of time-triggered systems (TTS) in dynamic operational environments. Scheduling frameworks face significant challenges, including... |
| xGFabric: Coupling Sensor Networks and HPC Facilities with Private 5G Wireless Networks for Real-Time Digital Agriculture | Liubov Kurafeeva, Alan Subedi, Ryan Hartung, Michael Fay, Avhishek Biswas, Shantenu Jha, Ozgur O. Kilic, Chandra Krintz, Andre Merzky, Douglas Thain, Mehmet C. Vuran, Rich Wolski | 2025-09-24 | Download | Advanced scientific applications require coupling distributed sensor networks with centralized high-performance computing facilities. Citrus Under Protective Screening (CUPS) exemplifies this need in ... |
| Energy Use of AI Inference: Efficiency Pathways and Test-Time Compute | Felipe Oviedo, Fiodar Kazhamiaka, Esha Choukse, Allen Kim, Amy Luers, Melanie Nakagawa, Ricardo Bianchini, Juan M. Lavista Ferres | 2025-09-24 | Download | As AI inference scales to billions of queries and emerging reasoning and agentic workflows increase token demand, reliable estimates of per-query energy use are increasingly important for capacity pla... |
| An Empirical Analysis of Secure Federated Learning for Autonomous Vehicle Applications | Md Jueal Mia, M. Hadi Amini | 2025-09-24 | Download | Federated Learning lends itself as a promising paradigm in enabling distributed learning for autonomous vehicles applications and ensuring data privacy while enhancing and refining predictive model pe... |
| Fulcrum: Optimizing Concurrent DNN Training and Inferencing on Edge Accelerators | Prashanthi S. K., Saisamarth Taluri, Pranav Gupta, Amartya Ranjan Saikia, Kunal Kumar Sahoo, Atharva Vinay Joshi, Lakshya Karwa, Kedar Dhule, Yogesh Simmhan | 2025-09-24 | Download | The proliferation of GPU accelerated edge devices like Nvidia Jetsons and the rise in privacy concerns are placing an emphasis on concurrent DNN training and inferencing on edge devices. |
| Pagoda: An Energy and Time Roofline Study for DNN Workloads on Edge Accelerators | Prashanthi S. K., Kunal Kumar Sahoo, Amartya Ranjan Saikia, Pranav Gupta, Atharva Vinay Joshi, Priyanshu Pansari, Yogesh Simmhan | 2025-09-24 | Download | Edge accelerators such as Nvidia Jetsons are becoming an integral part of the computing continuum, and are often used for DNN inferencing and training. |
| Characterizing the Performance of Accelerated Jetson Edge Devices for Training Deep Learning Models | Prashanthi S. K., Sai Anuroop Kesanapalli, Yogesh Simmhan | 2025-09-24 | Download | Deep Neural Networks (DNNs) have had a significant impact on domains like autonomous vehicles and smart cities through low-latency inferencing on edge computing devices close to the data source. |
| BurstEngine: an Efficient Distributed Framework for Training Transformers on Extremely Long Sequences of over 1M Tokens | Ao Sun, Weilin Zhao, Xu Han, Cheng Yang, Zhiyuan Liu, Chuan Shi, Maosong Sun | 2025-09-24 | Download | Existing methods for training LLMs on long-sequence data, such as Tensor Parallelism and Context Parallelism, exhibit low Model FLOPs Utilization as sequence lengths and number of GPUs increase, espec... |
| A Theory of Multi-Agent Generative Flow Networks | Leo Maxime Brunswic, Haozhi Wang, Shuang Luo, Jianye Hao, Amir Rasouli, Yinchuan Li | 2025-09-24 | Download | Generative flow networks utilize a flow-matching loss to learn a stochastic policy for generating objects from a sequence of actions, such that the probability of generating a pattern can be proportio... |
| Gyges: Dynamic Cross-Instance Parallelism Transformation for Efficient LLM Inference | Haoyu Chen, Xue Li, Kun Qian, Yu Guan, Jin Zhao, Xin Wang | 2025-09-24 | Download | Efficiently processing the dynamics of requests, especially the context length variance, is important in Large Language Model (LLM) serving scenarios. |
| Characterizing Adaptive Mesh Refinement on Heterogeneous Platforms with Parthenon-VIBE | Akash Poptani, Alireza Khadem, Scott Mahlke, Jonah Miller, Joshua Dolence, Reetuparna Das | 2025-09-24 | Download | Hero-class HPC simulations rely on Adaptive Mesh Refinement (AMR) to reduce compute and memory demands while maintaining accuracy. This work analyzes the performance of Parthenon, a block-structured A... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| An LLM-based Agentic Framework for Accessible Network Control | Samuel Lin, Jiawei Zhou, Minlan Yu | 2025-09-24 | Download | Traditional approaches to network management have been accessible only to a handful of highly-trained network operators with significant expert knowledge. |
| TSKAN: Interpretable Machine Learning for QoE modeling over Time Series Data | Kamal Singh, Priyanka Rawat, Sami Marouani, Baptiste Jeudy | 2025-09-24 | Download | Quality of Experience (QoE) modeling is crucial for optimizing video streaming services to capture the complex relationships between different features and user experience. |
| Can LLMs Forecast Internet Traffic from Social Media? | Jonatan Langlet, Mariano Scazzariello, Flavio Luciani, Marta Burocchi, Dejan Kostić, Marco Chiesa | 2025-09-24 | Download | Societal events shape the Internet's behavior. The death of a prominent public figure, a software launch, or a major sports match can trigger sudden demand surges that overwhelm peering points and con... |
| A Novel Short-Term Anomaly Prediction for IIoT with Software Defined Twin Network | Bilal Dalgic, Betul Sen, Muge Erel-Ozcevik | 2025-09-24 | Download | Secure monitoring and dynamic control in an IIoT environment are major requirements for current development goals. We believe that dynamic, secure monitoring of the IIoT environment can be achieved th... |
| Joint Ex-Post Location Calibration and Radio Map Construction under Biased Positioning Errors | Koki Kanzaki, Koya Sato | 2025-09-24 | Download | This paper proposes a high-accuracy radio map construction method tailored for environments where location information is affected by bursty errors. |
| SPARQ: An Optimization Framework for the Distribution of AI-Intensive Applications under Non-Linear Delay Constraints | Pietro Spadaccino, Paolo Di Lorenzo, Sergio Barbarossa, Antonia M. Tulino, Jaime Llorca | 2025-09-24 | Download | Next-generation real-time compute-intensive applications, such as extended reality, multi-user gaming, and autonomous transportation, are increasingly composed of heterogeneous AI-intensive functions ... |
| CollaPipe: Adaptive Segment-Optimized Pipeline Parallelism for Collaborative LLM Training in Heterogeneous Edge Networks | Jiewei Chen, Xiumei Deng, Zehui Xiong, Shaoyong Guo, Xuesong Qiu, Ping Wang, Dusit Niyato | 2025-09-24 | Download | The increasing demand for intelligent mobile applications has made multi-agent collaboration with Transformer-based large language models (LLMs) essential in mobile edge computing (MEC) networks. |
| Large Language Models for Real-World IoT Device Identification | Rameen Mahmood, Tousif Ahmed, Sai Teja Peddinti, Danny Yuxing Huang | 2025-09-24 | Download | The rapid expansion of IoT devices has outpaced current identification methods, creating significant risks for security, privacy, and network accountability. |
| Games Are Not Equal: Classifying Cloud Gaming Contexts for Effective User Experience Measurement | Yifan Wang, Minzhao Lyu, Vijay Sivaraman | 2025-09-24 | Download | To tap into the growing market of cloud gaming, whereby game graphics is rendered in the cloud and streamed back to the user as a video feed, network operators are creating monetizable assurance servi... |
| RIS-assisted Data Collection and Wireless Power Transfer in Low-altitude Wireless Networks | Wenwen Xie, Geng Sun, Jiahui Li, Jiacheng Wang, Yinqiu Liu, Dusit Niyato, Dong In Kim, Shiwen Mao | 2025-09-24 | Download | Low-altitude wireless networks (LAWNs) have become effective solutions for collecting data from low-power Internet-of-Things devices (IoTDs) in remote areas with limited communication infrastructure. |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Preparation Meets Opportunity: Enhancing Data Preprocessing for ML Training With Seneca | Omkar Desai, Ziyang Jiao, Shuyi Pei, Janki Bhimani, Bryan S. Kim | 2025-09-24 | Download | Input data preprocessing is a common bottleneck when concurrently training multimedia machine learning (ML) models in modern systems. To alleviate these bottlenecks and reduce the training time for co... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| denet, a lightweight command-line tool for process monitoring in benchmarking and beyond | Ben Carrillo, Izaskun Mallona | 2025-09-24 | Download | Summary: denet is a lightweight process monitoring utility providing real-time resource profiling of running processes. denet reports CPU, memory, disk I/O, network activity, and thread usage, includi... |
| Characterizing Adaptive Mesh Refinement on Heterogeneous Platforms with Parthenon-VIBE | Akash Poptani, Alireza Khadem, Scott Mahlke, Jonah Miller, Joshua Dolence, Reetuparna Das | 2025-09-24 | Download | Hero-class HPC simulations rely on Adaptive Mesh Refinement (AMR) to reduce compute and memory demands while maintaining accuracy. This work analyzes the performance of Parthenon, a block-structured A... |