2025-07-08

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| SLDB: An End-To-End Heterogeneous System-on-Chip Benchmark Suite for LLM-Aided Design | Elisavet Lydia Alvanaki, Kevin Lee, Luca P. Carloni | 2025-07-08 | Download | Over the last few years, Large Language Models (LLMs) have emerged as a valuable tool for Electronic Design Automation (EDA). State-of-the-art research in LLM-aided design has demonstrated the ability... |
| Multi-Queue SSD I/O Modeling & Its Implications for Data Structure Design | Erin Ransom, Andrew Lim, Michael Mitzenmacher | 2025-07-08 | Download | Understanding the performance profiles of storage devices and how best to utilize them has always been non-trivial due to factors such as seek times, caching, scheduling, concurrent access, flash wear... |
| PrefixAgent: An LLM-Powered Design Framework for Efficient Prefix Adder Optimization | Dongsheng Zuo, Jiadong Zhu, Yang Luo, Yuzhe Ma | 2025-07-08 | Download | Prefix adders are fundamental arithmetic circuits, but their design space grows exponentially with bit-width, posing significant optimization challenges. |
| RTGPU: Real-Time Computing with Graphics Processing Units | Atiyeh Gheibi-Fetrat, Amirsaeed Ahmadi-Tonekaboni, Farzam Koohi-Ronaghi, Pariya Hajipour, Sana Babayan-Vanestan, Fatemeh Fotouhi, Elahe Mortazavian-Farsani, Pouria Khajehpour-Dezfouli, Sepideh Safari, Shaahin Hessabi, Hamid Sarbazi-Azad | 2025-07-08 | Download | In this work, we survey the role of GPUs in real-time systems. Originally designed for parallel graphics workloads, GPUs are now widely used in time-critical applications such as machine learning, aut... |
| OLAF: Programmable Data Plane Acceleration for Asynchronous Distributed Reinforcement Learning | Nehal Baganal Krishna, Anam Tahir, Firas Khamis, Mina Tahmasbi Arashloo, Michael Zink, Amr Rizk | 2025-07-08 | Download | Asynchronous Distributed Reinforcement Learning (DRL) can suffer from degraded convergence when model updates become stale, often the result of network congestion and packet loss during large-scale tr... |
| GATMesh: Clock Mesh Timing Analysis using Graph Neural Networks | Muhammad Hadir Khan, Matthew Guthaus | 2025-07-08 | Download | Clock meshes are essential in high-performance VLSI systems for minimizing skew and handling PVT variations, but analyzing them is difficult due to reconvergent paths, multi-source driving, and input ... |
| iThermTroj: Exploiting Intermittent Thermal Trojans in Multi-Processor System-on-Chips | Mehdi Elahi, Mohamed R. Elshamy, Abdel-Hameed Badawy, Ahmad Patooghy | 2025-07-08 | Download | Thermal Trojan attacks present a pressing concern for the security and reliability of System-on-Chips (SoCs), especially in mobile applications. |
| Per-Row Activation Counting on Real Hardware: Demystifying Performance Overheads | Jumin Kim, Seungmin Baek, Minbok Wi, Hwayong Nam, Michael Jaemin Kim, Sukhan Lee, Kyomin Sohn, Jung Ho Ahn | 2025-07-08 | Download | Per-Row Activation Counting (PRAC), a DRAM read disturbance mitigation method, modifies key DRAM timing parameters, reportedly causing significant performance overheads in simulator-based studies. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| FedPhD: Federated Pruning with Hierarchical Learning of Diffusion Models | Qianyu Long, Qiyuan Wang, Christos Anagnostopoulos, Daning Bi | 2025-07-08 | Download | Federated Learning (FL), as a distributed learning paradigm, trains models over distributed clients' data. FL is particularly beneficial for distributed training of Diffusion Models (DMs), which are h... |
| Ampere: Communication-Efficient and High-Accuracy Split Federated Learning | Zihan Zhang, Leon Wong, Blesson Varghese | 2025-07-08 | Download | A Federated Learning (FL) system collaboratively trains neural networks across devices and a server but is limited by significant on-device computation costs. |
| A Unified Ontology for Scalable Knowledge Graph-Driven Operational Data Analytics in High-Performance Computing Systems | Junaid Ahmed Khan, Andrea Bartolini | 2025-07-08 | Download | Modern high-performance computing (HPC) systems generate massive volumes of heterogeneous telemetry data from millions of sensors monitoring compute, memory, power, cooling, and storage subsystems. |
| Resolving Extreme Data Scarcity by Explicit Physics Integration: An Application to Groundwater Heat Transport | Julia Pelzer, Corné Verburg, Alexander Heinlein, Miriam Schulte | 2025-07-08 | Download | Real-world flow applications in complex scientific and engineering domains, such as geosciences, challenge classical simulation methods due to large spatial domains, high spatio-temporal resolution re... |
| Efficient Federated Learning with Timely Update Dissemination | Juncheng Jia, Ji Liu, Chao Huo, Yihui Shen, Yang Zhou, Huaiyu Dai, Dejing Dou | 2025-07-08 | Download | Federated Learning (FL) has emerged as a compelling methodology for the management of distributed data, marked by significant advancements in recent years. |
| ECORE: Energy-Conscious Optimized Routing for Deep Learning Models at the Edge | Daghash K. Alqahtani, Maria A. Rodriguez, Muhammad Aamir Cheema, Hamid Rezatofighi, Adel N. Toosi | 2025-07-08 | Download | Edge computing enables data processing closer to the source, significantly reducing latency, an essential requirement for real-time vision-based analytics such as object detection in surveillance and ... |
| Towards Serverless Processing of Spatiotemporal Big Data Queries | Diana Baumann, Tim C. Rese, David Bermbach | 2025-07-08 | Download | Spatiotemporal data are being produced in continuously growing volumes by a variety of data sources and a variety of application fields rely on rapid analysis of such data. |
| Precomputed Dominant Resource Fairness | Serdar Metin | 2025-07-08 | Download | Although resource allocation is a well studied problem in computer science, until the prevalence of distributed systems, such as computing clouds and data centres, the question had been addressed pred... |
| A Formal Refutation of the Blockchain Trilemma | Craig Wright | 2025-07-08 | Download | The so-called blockchain trilemma asserts the impossibility of simultaneously achieving scalability, security, and decentralisation within a single blockchain protocol. |
| Air-FedGA: A Grouping Asynchronous Federated Learning Mechanism Exploiting Over-the-air Computation | Qianpiao Ma, Junlong Zhou, Xiangpeng Hou, Jianchun Liu, Hongli Xu, Jianeng Miao, Qingmin Jia | 2025-07-08 | Download | Federated learning (FL) is a new paradigm to train AI models over distributed edge devices (i.e., workers) using their local data, while confronting various challenges including communication resource... |
| AAPA: An Archetype-Aware Predictive Autoscaler with Uncertainty Quantification for Serverless Workloads on Kubernetes | Guilin Zhang, Srinivas Vippagunta, Raghavendra Nandagopal, Suchitra Raman, Jeff Xu, Marcus Pfeiffer, Shreeshankar Chatterjee, Ziqi Tan, Wulan Guo, Hailong Jiang | 2025-07-08 | Download | Serverless platforms such as Kubernetes are increasingly adopted in high-performance computing, yet autoscaling remains challenging under highly dynamic and heterogeneous workloads. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| One task to rule them all: A closer look at traffic classification generalizability | Elham Akbari, Zihao Zhou, Mohammad Ali Salahuddin, Noura Limam, Raouf Boutaba, Bertrand Mathieu, Stephanie Moteau, Stephane Tuffin | 2025-07-08 | Download | Existing website fingerprinting and traffic classification solutions do not work well when the evaluation context changes, as their performances often heavily rely on context-specific assumptions. |
| Programmable Governance for Group-Controlled Decentralized Identifiers | Carlo Segat, Sandro Rodriguez Garzo, Axel Küpper | 2025-07-08 | Download | Self-Sovereign Identity (SSI) is a paradigm for digital identity management that offers unique privacy advantages. A key technology in SSI is Decentralized Identifiers (DIDs) and their associated meta... |
| OLAF: Programmable Data Plane Acceleration for Asynchronous Distributed Reinforcement Learning | Nehal Baganal Krishna, Anam Tahir, Firas Khamis, Mina Tahmasbi Arashloo, Michael Zink, Amr Rizk | 2025-07-08 | Download | Asynchronous Distributed Reinforcement Learning (DRL) can suffer from degraded convergence when model updates become stale, often the result of network congestion and packet loss during large-scale tr... |
| Intra-DP: A High Performance Collaborative Inference System for Mobile Edge Computing | Zekai Sun, Xiuxian Guan, Zheng Lin, Zihan Fang, Xiangming Cai, Zhe Chen, Fangming Liu, Heming Cui, Jie Xiong, Wei Ni, Chau Yuen | 2025-07-08 | Download | Deploying deep neural networks (DNNs) on resource-constrained mobile devices presents significant challenges, particularly in achieving real-time performance while simultaneously coping with limited c... |
| A Satellite-Ground Synergistic Large Vision-Language Model System for Earth Observation | Yuxin Zhang, Jiahao Yang, Zhe Chen, Wenjun Zhu, Jin Zhao, Yue Gao | 2025-07-08 | Download | Recently, large vision-language models (LVLMs) unleash powerful analysis capabilities for low Earth orbit (LEO) satellite Earth observation images in the data center. |
| Baton: Compensate for Missing Wi-Fi Features for Practical Device-free Tracking | Yiming Zhao, Xuanqi Meng, Xinyu Tong, Xiulong Liu, Xin Xie, Wenyu Qu | 2025-07-08 | Download | Wi-Fi contact-free sensing systems have attracted widespread attention due to their ubiquity and convenience. The integrated sensing and communication (ISAC) technology utilizes off-the-shelf Wi-Fi co... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| gigiProfiler: Diagnosing Performance Issues by Uncovering Application Resource Bottlenecks | Yigong Hu, Haodong Zheng, Yicheng Liu, Dedong Xie, Youliang Huang, Baris Kasikci | 2025-07-08 | Download | Diagnosing performance bottlenecks in modern software is essential yet challenging, particularly as applications become more complex and rely on custom resource management policies. |
| Adaptive Communication Through Exploiting RIS, SSK, and CIM for Improved Reliability and Efficiency | Ferhat Bayar, Onur Salan, Erdogan Aydin, Haci Ilhan | 2025-07-08 | Download | In this paper, we present a novel communication system model that integrates reconfigurable intelligent surfaces (RIS), spatial shift keying (SSK), and code index modulation (CIM) based on Hadamard co... |
