2025-03-08

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Exploring the Performance Improvement of Tensor Processing Engines through Transformation in the Bit-weight Dimension of MACs	Qizhe Wu, Huawen Liang, Yuchen Gui, Zhichen Zeng, Zerong He, Linfeng Tao, Xiaotian Wang, Letian Zhao, Zhaoxi Zeng, Wei Yuan, Wei Wu, Xi Jin	2025-03-08	Download	General matrix-matrix multiplication (GEMM) is a cornerstone of AI computations, making tensor processing engines (TPEs) increasingly critical in GPUs and domain-specific architectures.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
HPDR: High-Performance Portable Scientific Data Reduction Framework	Jieyang Chen, Qian Gong, Yanliang Li, Xin Liang, Lipeng Wan, Qing Liu, Norbert Podhorszki, Scott Klasky	2025-03-08	Download	The rapid growth of scientific data is surpassing advancements in computing, creating challenges in storage, transfer, and analysis, particularly at the exascale.
VerIso: Verifiable Isolation Guarantees for Database Transactions	Shabnam Ghasemirad, Si Liu, Christoph Sprenger, Luca Multazzu, David Basin	2025-03-08	Download	Isolation bugs, stemming especially from design-level defects, have been repeatedly found in carefully designed and extensively tested production databases over decades.
Mitigating Blockchain extractable value (BEV) threats by Distributed Transaction Sequencing in Blockchains	Xiongfei Zhao, Hou-Wan Long, Zhengzhe Li, Jiangchuan Liu, Yain-Whar Si	2025-03-08	Download	The rapid growth of Blockchain and Decentralized Finance (DeFi) has introduced new challenges and vulnerabilities that threaten the integrity and efficiency of the ecosystem.
QAOA in Quantum Datacenters: Parallelization, Simulation, and Orchestration	Amana Liaqat, Ahmed Darwish, Adrian Roman, Stephen DiAdamo	2025-03-08	Download	Scaling quantum computing requires networked systems, leveraging HPC for distributed simulation now and quantum networks in the future. Quantum datacenters will be the primary access point for users, ...
GraphGen+: Advancing Distributed Subgraph Generation and Graph Learning On Industrial Graphs	Yue Jin, Yongchao Liu, Chuntao Hong	2025-03-08	Download	Graph-based computations are crucial in a wide range of applications, where graphs can scale to trillions of edges. To enable efficient training on such large graphs, mini-batch subgraph sampling is c...
Distributed Graph Neural Network Inference With Just-In-Time Compilation For Industry-Scale Graphs	Xiabao Wu, Yongchao Liu, Wei Qin, Chuntao Hong	2025-03-08	Download	Graph neural networks (GNNs) have delivered remarkable results in various fields. However, the rapid increase in the scale of graph data has introduced significant performance bottlenecks for GNN infe...
Lightweight Software Kernels and Hardware Extensions for Efficient Sparse Deep Neural Networks on Microcontrollers	Francesco Daghero, Daniele Jahier Pagliari, Francesco Conti, Luca Benini, Massimo Poncino, Alessio Burrello	2025-03-08	Download	The acceleration of pruned Deep Neural Networks (DNNs) on edge devices such as Microcontrollers (MCUs) is a challenging task, given the tight area- and power-constraints of these devices.
Momentum-based Distributed Resource Scheduling Optimization Subject to Sector-Bound Nonlinearity and Latency	Mohammadreza Doostmohammadian, Zulfiya R. Gabidullina, Hamid R. Rabiee	2025-03-08	Download	This paper proposes an accelerated consensus-based distributed iterative algorithm for resource allocation and scheduling. The proposed gradient-tracking algorithm introduces an auxiliary variable to ...
FedSem: A Resource Allocation Scheme for Federated Learning Assisted Semantic Communication	Xinyu Zhou, Yang Li, Jun Zhao	2025-03-08	Download	Semantic communication (SemCom), regarded as the evolution of the traditional Shannon's communication model, stresses the transmission of semantic information instead of the data itself.
ML-based Adaptive Prefetching and Data Placement for US HEP Systems	Venkat Sai Suman Lamba Karanam, Sarat Sasank Barla, Byrav Ramamurthy, Derek Weitzel	2025-03-08	Download	Although benefits from caching in US HEP are well-known, current caching strategies are not adaptive, i.e., they do not adapt to changing cache access patterns.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Synergizing AI and Digital Twins for Next-Generation Network Optimization, Forecasting, and Security	Zifan Zhang, Minghong Fang, Dianwei Chen, Xianfeng Yang, Yuchen Liu	2025-03-08	Download	Digital network twins (DNTs) are virtual representations of physical networks, designed to enable real-time monitoring, simulation, and optimization of network performance.
AmazonNetLink: Enabling Education Access in Remote Amazonian Regions through Delay-Tolerant Networks	Andrés Fernando Barón Sandoval, Milena Radenkovic	2025-03-08	Download	Access to educational materials in remote Amazonian communities is challenged by limited communication infrastructure. This paper proposes a novel delay-tolerant network (DTN) approach for content dis...
FALCON: A Framework for Fault Prediction in Open RAN Using Multi-Level Telemetry	Yaswanth Kumar LS, Somya Jain, Bheemarjuna Reddy Tamma, Koteswararao Kondepu	2025-03-08	Download	O-RAN has brought in deployment flexibility and intelligent RAN control for mobile operators through its disaggregated and modular architecture using open interfaces.
Integration of SDN and Digital Twin for the Intelligent Detection of DoC Attacks in WRSNs	Muhammad Umar Farooq Qaisar, Weijie Yuan, Guangjie Han, Adeel Ahmed, Chang Liu, Md. Jalil Piran	2025-03-08	Download	Wireless rechargeable sensor networks (WRSNs), supported by recent advancements in wireless power transfer (WPT) technology, hold significant potential for extending network lifetime.
Empowering Edge Intelligence: A Comprehensive Survey on On-Device AI Models	Xubin Wang, Zhiqing Tang, Jianxiong Guo, Tianhui Meng, Chenhao Wang, Tian Wang, Weijia Jia	2025-03-08	Download	The rapid advancement of artificial intelligence (AI) technologies has led to an increasing deployment of AI models on edge and terminal devices, driven by the proliferation of the Internet of Things ...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Lightweight Software Kernels and Hardware Extensions for Efficient Sparse Deep Neural Networks on Microcontrollers	Francesco Daghero, Daniele Jahier Pagliari, Francesco Conti, Luca Benini, Massimo Poncino, Alessio Burrello	2025-03-08	Download	The acceleration of pruned Deep Neural Networks (DNNs) on edge devices such as Microcontrollers (MCUs) is a challenging task, given the tight area- and power-constraints of these devices.