2025-07-08

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
SLDB: An End-To-End Heterogeneous System-on-Chip Benchmark Suite for LLM-Aided Design	Elisavet Lydia Alvanaki, Kevin Lee, Luca P. Carloni	2025-07-08	Download	Over the last few years, Large Language Models (LLMs) have emerged as a valuable tool for Electronic Design Automation (EDA). State-of-the-art research in LLM-aided design has demonstrated the ability...
Multi-Queue SSD I/O Modeling & Its Implications for Data Structure Design	Erin Ransom, Andrew Lim, Michael Mitzenmacher	2025-07-08	Download	Understanding the performance profiles of storage devices and how best to utilize them has always been non-trivial due to factors such as seek times, caching, scheduling, concurrent access, flash wear...
PrefixAgent: An LLM-Powered Design Framework for Efficient Prefix Adder Optimization	Dongsheng Zuo, Jiadong Zhu, Yang Luo, Yuzhe Ma	2025-07-08	Download	Prefix adders are fundamental arithmetic circuits, but their design space grows exponentially with bit-width, posing significant optimization challenges.
RTGPU: Real-Time Computing with Graphics Processing Units	Atiyeh Gheibi-Fetrat, Amirsaeed Ahmadi-Tonekaboni, Farzam Koohi-Ronaghi, Pariya Hajipour, Sana Babayan-Vanestan, Fatemeh Fotouhi, Elahe Mortazavian-Farsani, Pouria Khajehpour-Dezfouli, Sepideh Safari, Shaahin Hessabi, Hamid Sarbazi-Azad	2025-07-08	Download	In this work, we survey the role of GPUs in real-time systems. Originally designed for parallel graphics workloads, GPUs are now widely used in time-critical applications such as machine learning, aut...
OLAF: Programmable Data Plane Acceleration for Asynchronous Distributed Reinforcement Learning	Nehal Baganal Krishna, Anam Tahir, Firas Khamis, Mina Tahmasbi Arashloo, Michael Zink, Amr Rizk	2025-07-08	Download	Asynchronous Distributed Reinforcement Learning (DRL) can suffer from degraded convergence when model updates become stale, often the result of network congestion and packet loss during large-scale tr...
GATMesh: Clock Mesh Timing Analysis using Graph Neural Networks	Muhammad Hadir Khan, Matthew Guthaus	2025-07-08	Download	Clock meshes are essential in high-performance VLSI systems for minimizing skew and handling PVT variations, but analyzing them is difficult due to reconvergent paths, multi-source driving, and input ...
iThermTroj: Exploiting Intermittent Thermal Trojans in Multi-Processor System-on-Chips	Mehdi Elahi, Mohamed R. Elshamy, Abdel-Hameed Badawy, Ahmad Patooghy	2025-07-08	Download	Thermal Trojan attacks present a pressing concern for the security and reliability of System-on-Chips (SoCs), especially in mobile applications.
Per-Row Activation Counting on Real Hardware: Demystifying Performance Overheads	Jumin Kim, Seungmin Baek, Minbok Wi, Hwayong Nam, Michael Jaemin Kim, Sukhan Lee, Kyomin Sohn, Jung Ho Ahn	2025-07-08	Download	Per-Row Activation Counting (PRAC), a DRAM read disturbance mitigation method, modifies key DRAM timing parameters, reportedly causing significant performance overheads in simulator-based studies.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
FedPhD: Federated Pruning with Hierarchical Learning of Diffusion Models	Qianyu Long, Qiyuan Wang, Christos Anagnostopoulos, Daning Bi	2025-07-08	Download	Federated Learning (FL), as a distributed learning paradigm, trains models over distributed clients' data. FL is particularly beneficial for distributed training of Diffusion Models (DMs), which are h...
Ampere: Communication-Efficient and High-Accuracy Split Federated Learning	Zihan Zhang, Leon Wong, Blesson Varghese	2025-07-08	Download	A Federated Learning (FL) system collaboratively trains neural networks across devices and a server but is limited by significant on-device computation costs.
A Unified Ontology for Scalable Knowledge Graph-Driven Operational Data Analytics in High-Performance Computing Systems	Junaid Ahmed Khan, Andrea Bartolini	2025-07-08	Download	Modern high-performance computing (HPC) systems generate massive volumes of heterogeneous telemetry data from millions of sensors monitoring compute, memory, power, cooling, and storage subsystems.
Resolving Extreme Data Scarcity by Explicit Physics Integration: An Application to Groundwater Heat Transport	Julia Pelzer, Corné Verburg, Alexander Heinlein, Miriam Schulte	2025-07-08	Download	Real-world flow applications in complex scientific and engineering domains, such as geosciences, challenge classical simulation methods due to large spatial domains, high spatio-temporal resolution re...
Efficient Federated Learning with Timely Update Dissemination	Juncheng Jia, Ji Liu, Chao Huo, Yihui Shen, Yang Zhou, Huaiyu Dai, Dejing Dou	2025-07-08	Download	Federated Learning (FL) has emerged as a compelling methodology for the management of distributed data, marked by significant advancements in recent years.
ECORE: Energy-Conscious Optimized Routing for Deep Learning Models at the Edge	Daghash K. Alqahtani, Maria A. Rodriguez, Muhammad Aamir Cheema, Hamid Rezatofighi, Adel N. Toosi	2025-07-08	Download	Edge computing enables data processing closer to the source, significantly reducing latency, an essential requirement for real-time vision-based analytics such as object detection in surveillance and ...
Towards Serverless Processing of Spatiotemporal Big Data Queries	Diana Baumann, Tim C. Rese, David Bermbach	2025-07-08	Download	Spatiotemporal data are being produced in continuously growing volumes by a variety of data sources and a variety of application fields rely on rapid analysis of such data.
Precomputed Dominant Resource Fairness	Serdar Metin	2025-07-08	Download	Although resource allocation is a well studied problem in computer science, until the prevalence of distributed systems, such as computing clouds and data centres, the question had been addressed pred...
A Formal Refutation of the Blockchain Trilemma	Craig Wright	2025-07-08	Download	The so-called blockchain trilemma asserts the impossibility of simultaneously achieving scalability, security, and decentralisation within a single blockchain protocol.
Air-FedGA: A Grouping Asynchronous Federated Learning Mechanism Exploiting Over-the-air Computation	Qianpiao Ma, Junlong Zhou, Xiangpeng Hou, Jianchun Liu, Hongli Xu, Jianeng Miao, Qingmin Jia	2025-07-08	Download	Federated learning (FL) is a new paradigm to train AI models over distributed edge devices (i.e., workers) using their local data, while confronting various challenges including communication resource...
AAPA: An Archetype-Aware Predictive Autoscaler with Uncertainty Quantification for Serverless Workloads on Kubernetes	Guilin Zhang, Srinivas Vippagunta, Raghavendra Nandagopal, Suchitra Raman, Jeff Xu, Marcus Pfeiffer, Shreeshankar Chatterjee, Ziqi Tan, Wulan Guo, Hailong Jiang	2025-07-08	Download	Serverless platforms such as Kubernetes are increasingly adopted in high-performance computing, yet autoscaling remains challenging under highly dynamic and heterogeneous workloads.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
One task to rule them all: A closer look at traffic classification generalizability	Elham Akbari, Zihao Zhou, Mohammad Ali Salahuddin, Noura Limam, Raouf Boutaba, Bertrand Mathieu, Stephanie Moteau, Stephane Tuffin	2025-07-08	Download	Existing website fingerprinting and traffic classification solutions do not work well when the evaluation context changes, as their performances often heavily rely on context-specific assumptions.
Programmable Governance for Group-Controlled Decentralized Identifiers	Carlo Segat, Sandro Rodriguez Garzo, Axel Küpper	2025-07-08	Download	Self-Sovereign Identity (SSI) is a paradigm for digital identity management that offers unique privacy advantages. A key technology in SSI is Decentralized Identifiers (DIDs) and their associated meta...
OLAF: Programmable Data Plane Acceleration for Asynchronous Distributed Reinforcement Learning	Nehal Baganal Krishna, Anam Tahir, Firas Khamis, Mina Tahmasbi Arashloo, Michael Zink, Amr Rizk	2025-07-08	Download	Asynchronous Distributed Reinforcement Learning (DRL) can suffer from degraded convergence when model updates become stale, often the result of network congestion and packet loss during large-scale tr...
Intra-DP: A High Performance Collaborative Inference System for Mobile Edge Computing	Zekai Sun, Xiuxian Guan, Zheng Lin, Zihan Fang, Xiangming Cai, Zhe Chen, Fangming Liu, Heming Cui, Jie Xiong, Wei Ni, Chau Yuen	2025-07-08	Download	Deploying deep neural networks (DNNs) on resource-constrained mobile devices presents significant challenges, particularly in achieving real-time performance while simultaneously coping with limited c...
A Satellite-Ground Synergistic Large Vision-Language Model System for Earth Observation	Yuxin Zhang, Jiahao Yang, Zhe Chen, Wenjun Zhu, Jin Zhao, Yue Gao	2025-07-08	Download	Recently, large vision-language models (LVLMs) unleash powerful analysis capabilities for low Earth orbit (LEO) satellite Earth observation images in the data center.
Baton: Compensate for Missing Wi-Fi Features for Practical Device-free Tracking	Yiming Zhao, Xuanqi Meng, Xinyu Tong, Xiulong Liu, Xin Xie, Wenyu Qu	2025-07-08	Download	Wi-Fi contact-free sensing systems have attracted widespread attention due to their ubiquity and convenience. The integrated sensing and communication (ISAC) technology utilizes off-the-shelf Wi-Fi co...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
gigiProfiler: Diagnosing Performance Issues by Uncovering Application Resource Bottlenecks	Yigong Hu, Haodong Zheng, Yicheng Liu, Dedong Xie, Youliang Huang, Baris Kasikci	2025-07-08	Download	Diagnosing performance bottlenecks in modern software is essential yet challenging, particularly as applications become more complex and rely on custom resource management policies.
Adaptive Communication Through Exploiting RIS, SSK, and CIM for Improved Reliability and Efficiency	Ferhat Bayar, Onur Salan, Erdogan Aydin, Haci Ilhan	2025-07-08	Download	In this paper, we present a novel communication system model that integrates reconfigurable intelligent surfaces (RIS), spatial shift keying (SSK), and code index modulation (CIM) based on Hadamard co...