2025-11-10

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
FractalCloud: A Fractal-Inspired Architecture for Efficient Large-Scale Point Cloud Processing	Yuzhe Fu, Changchun Zhou, Hancheng Ye, Bowen Duan, Qiyu Huang, Chiyue Wei, Cong Guo, Hai "Helen" Li, Yiran Chen	2025-11-10	Download	Three-dimensional (3D) point clouds are increasingly used in applications such as autonomous driving, robotics, and virtual reality (VR). Point-based neural networks (PNNs) have demonstrated strong pe...
ZeroSim: Zero-Shot Analog Circuit Evaluation with Unified Transformer Embeddings	Xiaomeng Yang, Jian Gao, Yanzhi Wang, Xuan Zhang	2025-11-10	Download	Although recent advancements in learning-based analog circuit design automation have tackled tasks such as topology generation, device sizing, and layout synthesis, efficient performance evaluation re...
Decoupled Control Flow and Data Access in RISC-V GPGPUs	Giuseppe M. Sarda, Nimish Shah, Abubakr Nada, Debjyoti Bhattacharjee, Marian Verhelst	2025-11-10	Download	Vortex, a newly proposed open-source GPGPU platform based on the RISC-V ISA, offers a valid alternative for GPGPU research over the broadly-used modeling platforms based on commercial GPUs.
FPGA-Accelerated RISC-V ISA Extensions for Efficient Neural Network Inference on Edge Devices	Arya Parameshwara, Santosh Hanamappa Mokashi	2025-11-10	Download	Edge AI deployment faces critical challenges balancing computational performance, energy efficiency, and resource constraints. This paper presents FPGA-accelerated RISC-V instruction set architecture ...
Optimizing GEMM for Energy and Performance on Versal ACAP Architectures	Ilias Papalamprou, Dimosthenis Masouros, Ioannis Loudaros, Francky Catthoor, Dimitrios Soudris	2025-11-10	Download	General Matrix Multiplication (GEMM) is a fundamental operation in many scientific workloads, signal processing, and particularly deep learning.
P3-LLM: An Integrated NPU-PIM Accelerator for LLM Inference Using Hybrid Numerical Formats	Yuzong Chen, Chao Fang, Xilai Dai, Yuheng Wu, Thierry Tambe, Marian Verhelst, Mohamed S. Abdelfattah	2025-11-10	Download	The substantial memory bandwidth and computational demands of large language models (LLMs) present critical challenges for efficient inference.
ASTER: Attention-based Spiking Transformer Engine for Event-driven Reasoning	Tamoghno Das, Khanh Phan Vu, Hanning Chen, Hyunwoo Oh, Mohsen Imani	2025-11-10	Download	The integration of spiking neural networks (SNNs) with transformer-based architectures has opened new opportunities for bio-inspired low-power, event-driven visual reasoning on edge devices.
Reconfigurable Quantum Instruction Set Computers for High Performance Attainable on Hardware	Zhaohui Yang, Dawei Ding, Qi Ye, Cupjin Huang, Jianxin Chen, Yuan Xie	2025-11-10	Download	The performance of current quantum hardware is severely limited. While expanding the quantum ISA with high-fidelity, expressive basis gates is a key path forward, it imposes significant gate calibrati...
Preemption-Enhanced Benchmark Suite for FPGAs	Arsalan Ali Malik, John Buchanan, Aydin Aysu	2025-11-10	Download	Field-Programmable Gate Arrays (FPGAs) have become essential in cloud computing due to their reconfigurability, energy efficiency, and ability to accelerate domain-specific workloads.
Hardware-Aware Neural Network Compilation with Learned Optimization: A RISC-V Accelerator Approach	Ravindra Ganti, Steve Xu	2025-11-10	Download	We present XgenSilicon ML Compiler, a fully automated end-to-end compilation framework that transforms high-level machine learning models into optimized RISC-V assembly code for custom ASIC accelerato...
EONSim: An NPU Simulator for On-Chip Memory and Embedding Vector Operations	Sangun Choi, Yunho Oh	2025-11-10	Download	Embedding vector operations are a key component of modern deep neural network workloads. Unlike matrix operations with deterministic access patterns, embedding vector operations exhibit input data-dep...
DMA Collectives for Efficient ML Communication Offloads	Suchita Pati, Mahzabeen Islam, Shaizeen Aga, Mohamed Assem Ibrahim	2025-11-10	Download	Offloading machine learning (ML) communication collectives to direct memory access (DMA) engines has emerged as an interesting and low-cost solution to efficiently overlap computation and communicatio...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
SemanticForge: Repository-Level Code Generation through Semantic Knowledge Graphs and Constraint Satisfaction	Wuyang Zhang, Chenkai Zhang, Zhen Luo, Jianming Ma, Wangming Yuan, Chuqiao Gu, Chenwei Feng	2025-11-10	Download	Large language models (LLMs) have transformed software development by enabling automated code generation, yet they frequently suffer from systematic errors that limit practical deployment.
HyProv: Hybrid Provenance Management for Scientific Workflows	Vasilis Bountris, Lauritz Thamsen, Ulf Leser	2025-11-10	Download	Provenance plays a crucial role in scientific workflow execution, for instance by providing data for failure analysis, real-time monitoring, or statistics on resource utilization for right-sizing allo...
Lightning Grasp: High Performance Procedural Grasp Synthesis with Contact Fields	Zhao-Heng Yin, Pieter Abbeel	2025-11-10	Download	Despite years of research, real-time diverse grasp synthesis for dexterous hands remains an unsolved core challenge in robotics and computer graphics.
LLMServingSim2.0: A Unified Simulator for Heterogeneous Hardware and Serving Techniques in LLM Infrastructure	Jaehong Cho, Hyunmin Choi, Jongse Park	2025-11-10	Download	This paper introduces LLMServingSim2.0, a system simulator designed for exploring heterogeneous hardware in large-scale LLM serving systems.
Resilient by Design -- Active Inference for Distributed Continuum Intelligence	Praveen Kumar Donta, Alfreds Lapkovskis, Enzo Mingozzi, Schahram Dustdar	2025-11-10	Download	Failures are the norm in highly complex and heterogeneous devices spanning the distributed computing continuum (DCC), from resource-constrained IoT and edge nodes to high-performance computing systems...
A GPU-boosted high-performance multi-working condition joint analysis framework for predicting dynamics of textured axial piston pump	Xin Yao, Yang Liu, Jin Jiang, Yesen Chen, Zhilong Chen, Hongkang Dong, Xiaofeng Wei, Teng Zhang, Dongyun Wang	2025-11-10	Download	Accurate simulation to dynamics of axial piston pump (APP) is essential for its design, manufacture and maintenance. However, limited by computation capacity of CPU device and traditional solvers, con...
Wireless Sensor Networks Nodes Clustering and Optimization Based on Fuzzy C-Means and Water Strider Algorithms	Raya Majid Alsharfa, Mahmood Mohassel Feghhi, Majid Hameed Majeed	2025-11-10	Download	Wireless sensor networks (WSNs) face critical challenges in energy management and network lifetime optimization due to limited battery resources and communication overhead.
Argus: Quality-Aware High-Throughput Text-to-Image Inference Serving System	Shubham Agarwal, Subrata Mitra, Saud Iqbal	2025-11-10	Download	Text-to-image (T2I) models have gained significant popularity. Most of these are diffusion models with unique computational characteristics, distinct from both traditional small-scale ML models and la...
DMA Collectives for Efficient ML Communication Offloads	Suchita Pati, Mahzabeen Islam, Shaizeen Aga, Mohamed Assem Ibrahim	2025-11-10	Download	Offloading machine learning (ML) communication collectives to direct memory access (DMA) engines has emerged as an interesting and low-cost solution to efficiently overlap computation and communicatio...
Saarthi: An End-to-End Intelligent Platform for Optimising Distributed Serverless Workloads	Siddharth Agarwal, Maria A. Rodriguez, Rajkumar Buyya	2025-11-10	Download	FaaS offers significant advantages with its infrastructure abstraction, on-demand execution, and attractive no idle resource pricing for modern cloud applications.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
UAV-Assisted Resilience in 6G and Beyond Network Energy Saving: A Multi-Agent DRL Approach	Dao Lan Vy Dinh, Anh Nguyen Thi Mai, Hung Tran, Giang Quynh Le Vu, Tu Dac Ho, Zhenni Pan, Vo Nhan Van, Symeon Chatzinotas, Dinh-Hieu Tran	2025-11-10	Download	This paper investigates the unmanned aerial vehicle (UAV)-assisted resilience perspective in the 6G network energy saving (NES) scenario. More specifically, we consider multiple ground base stations (...
When Intelligence Overloads Infrastructure: A Forecast Model for AI-Driven Bottlenecks	Gamal Refai-Ahmed, Mallik Tatipamula, Victor Zhirnov, Ahmed Refaey Hussein, Abdallah Shami	2025-11-10	Download	The exponential growth of AI agents and connected devices fundamentally transforms the structure and capacity demands of global digital infrastructure.
Resilient by Design -- Active Inference for Distributed Continuum Intelligence	Praveen Kumar Donta, Alfreds Lapkovskis, Enzo Mingozzi, Schahram Dustdar	2025-11-10	Download	Failures are the norm in highly complex and heterogeneous devices spanning the distributed computing continuum (DCC), from resource-constrained IoT and edge nodes to high-performance computing systems...
Improving Remote Patient Monitoring Systems Using a Fog-based IoT Platform with Speech Recognition	Marc Jayson Baucas, Petros Spachos	2025-11-10	Download	Due to the recent shortage of resources in the healthcare industry, Remote Patient Monitoring (RPM) systems arose to establish a convenient alternative for accessing healthcare services remotely.
Graph Representation-based Model Poisoning on the Heterogeneous Internet of Agents	Hanlin Cai, Houtianfu Wang, Haofan Dong, Kai Li, Sai Zou, Ozgur B. Akan	2025-11-10	Download	Internet of Agents (IoA) envisions a unified, agent-centric paradigm where heterogeneous large language model (LLM) agents can interconnect and collaborate at scale.
Fine-Tuning Diffusion-Based Recommender Systems via Reinforcement Learning with Reward Function Optimization	Yu Hou, Hua Li, Ha Young Kim, Won-Yong Shin	2025-11-10	Download	Diffusion models recently emerged as a powerful paradigm for recommender systems, offering state-of-the-art performance by modeling the generative process of user-item interactions.

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
GoCkpt: Gradient-Assisted Multi-Step overlapped Checkpointing for Efficient LLM Training	Keyao Zhang, Yiquan Chen, Zhuo Hu, Wenhai Lin, Jiexiong Xu, Wenzhi Chen	2025-11-10	Download	The accuracy of large language models (LLMs) improves with increasing model size, but increasing model complexity also poses significant challenges to training stability.
Preemption-Enhanced Benchmark Suite for FPGAs	Arsalan Ali Malik, John Buchanan, Aydin Aysu	2025-11-10	Download	Field-Programmable Gate Arrays (FPGAs) have become essential in cloud computing due to their reconfigurability, energy efficiency, and ability to accelerate domain-specific workloads.

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Energy Consumption of Dataframe Libraries for End-to-End Deep Learning Pipelines: A Comparative Analysis	Punit Kumar, Asif Imran, Tevfik Kosar	2025-11-10	Download	This paper presents a detailed comparative analysis of the performance of three major Python data manipulation libraries - Pandas, Polars, and Dask - specifically when embedded within complete deep le...