
2025-03-08

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Exploring the Performance Improvement of Tensor Processing Engines through Transformation in the Bit-weight Dimension of MACs | Qizhe Wu, Huawen Liang, Yuchen Gui, Zhichen Zeng, Zerong He, Linfeng Tao, Xiaotian Wang, Letian Zhao, Zhaoxi Zeng, Wei Yuan, Wei Wu, Xi Jin | 2025-03-08 | Download | General matrix-matrix multiplication (GEMM) is a cornerstone of AI computations, making tensor processing engines (TPEs) increasingly critical in GPUs and domain-specific architectures. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| HPDR: High-Performance Portable Scientific Data Reduction Framework | Jieyang Chen, Qian Gong, Yanliang Li, Xin Liang, Lipeng Wan, Qing Liu, Norbert Podhorszki, Scott Klasky | 2025-03-08 | Download | The rapid growth of scientific data is surpassing advancements in computing, creating challenges in storage, transfer, and analysis, particularly at the exascale. |
| VerIso: Verifiable Isolation Guarantees for Database Transactions | Shabnam Ghasemirad, Si Liu, Christoph Sprenger, Luca Multazzu, David Basin | 2025-03-08 | Download | Isolation bugs, stemming especially from design-level defects, have been repeatedly found in carefully designed and extensively tested production databases over decades. |
| Mitigating Blockchain extractable value (BEV) threats by Distributed Transaction Sequencing in Blockchains | Xiongfei Zhao, Hou-Wan Long, Zhengzhe Li, Jiangchuan Liu, Yain-Whar Si | 2025-03-08 | Download | The rapid growth of Blockchain and Decentralized Finance (DeFi) has introduced new challenges and vulnerabilities that threaten the integrity and efficiency of the ecosystem. |
| QAOA in Quantum Datacenters: Parallelization, Simulation, and Orchestration | Amana Liaqat, Ahmed Darwish, Adrian Roman, Stephen DiAdamo | 2025-03-08 | Download | Scaling quantum computing requires networked systems, leveraging HPC for distributed simulation now and quantum networks in the future. Quantum datacenters will be the primary access point for users, ... |
| GraphGen+: Advancing Distributed Subgraph Generation and Graph Learning On Industrial Graphs | Yue Jin, Yongchao Liu, Chuntao Hong | 2025-03-08 | Download | Graph-based computations are crucial in a wide range of applications, where graphs can scale to trillions of edges. To enable efficient training on such large graphs, mini-batch subgraph sampling is c... |
| Distributed Graph Neural Network Inference With Just-In-Time Compilation For Industry-Scale Graphs | Xiabao Wu, Yongchao Liu, Wei Qin, Chuntao Hong | 2025-03-08 | Download | Graph neural networks (GNNs) have delivered remarkable results in various fields. However, the rapid increase in the scale of graph data has introduced significant performance bottlenecks for GNN infe... |
| Lightweight Software Kernels and Hardware Extensions for Efficient Sparse Deep Neural Networks on Microcontrollers | Francesco Daghero, Daniele Jahier Pagliari, Francesco Conti, Luca Benini, Massimo Poncino, Alessio Burrello | 2025-03-08 | Download | The acceleration of pruned Deep Neural Networks (DNNs) on edge devices such as Microcontrollers (MCUs) is a challenging task, given the tight area- and power-constraints of these devices. |
| Momentum-based Distributed Resource Scheduling Optimization Subject to Sector-Bound Nonlinearity and Latency | Mohammadreza Doostmohammadian, Zulfiya R. Gabidullina, Hamid R. Rabiee | 2025-03-08 | Download | This paper proposes an accelerated consensus-based distributed iterative algorithm for resource allocation and scheduling. The proposed gradient-tracking algorithm introduces an auxiliary variable to ... |
| FedSem: A Resource Allocation Scheme for Federated Learning Assisted Semantic Communication | Xinyu Zhou, Yang Li, Jun Zhao | 2025-03-08 | Download | Semantic communication (SemCom), regarded as the evolution of the traditional Shannon's communication model, stresses the transmission of semantic information instead of the data itself. |
| ML-based Adaptive Prefetching and Data Placement for US HEP Systems | Venkat Sai Suman Lamba Karanam, Sarat Sasank Barla, Byrav Ramamurthy, Derek Weitzel | 2025-03-08 | Download | Although benefits from caching in US HEP are well-known, current caching strategies are not adaptive, i.e., they do not adapt to changing cache access patterns. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Synergizing AI and Digital Twins for Next-Generation Network Optimization, Forecasting, and Security | Zifan Zhang, Minghong Fang, Dianwei Chen, Xianfeng Yang, Yuchen Liu | 2025-03-08 | Download | Digital network twins (DNTs) are virtual representations of physical networks, designed to enable real-time monitoring, simulation, and optimization of network performance. |
| AmazonNetLink: Enabling Education Access in Remote Amazonian Regions through Delay-Tolerant Networks | Andrés Fernando Barón Sandoval, Milena Radenkovic | 2025-03-08 | Download | Access to educational materials in remote Amazonian communities is challenged by limited communication infrastructure. This paper proposes a novel delay-tolerant network (DTN) approach for content dis... |
| FALCON: A Framework for Fault Prediction in Open RAN Using Multi-Level Telemetry | Yaswanth Kumar LS, Somya Jain, Bheemarjuna Reddy Tamma, Koteswararao Kondepu | 2025-03-08 | Download | O-RAN has brought in deployment flexibility and intelligent RAN control for mobile operators through its disaggregated and modular architecture using open interfaces. |
| Integration of SDN and Digital Twin for the Intelligent Detection of DoC Attacks in WRSNs | Muhammad Umar Farooq Qaisar, Weijie Yuan, Guangjie Han, Adeel Ahmed, Chang Liu, Md. Jalil Piran | 2025-03-08 | Download | Wireless rechargeable sensor networks (WRSNs), supported by recent advancements in wireless power transfer (WPT) technology, hold significant potential for extending network lifetime. |
| Empowering Edge Intelligence: A Comprehensive Survey on On-Device AI Models | Xubin Wang, Zhiqing Tang, Jianxiong Guo, Tianhui Meng, Chenhao Wang, Tian Wang, Weijia Jia | 2025-03-08 | Download | The rapid advancement of artificial intelligence (AI) technologies has led to an increasing deployment of AI models on edge and terminal devices, driven by the proliferation of the Internet of Things ... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Lightweight Software Kernels and Hardware Extensions for Efficient Sparse Deep Neural Networks on Microcontrollers | Francesco Daghero, Daniele Jahier Pagliari, Francesco Conti, Luca Benini, Massimo Poncino, Alessio Burrello | 2025-03-08 | Download | The acceleration of pruned Deep Neural Networks (DNNs) on edge devices such as Microcontrollers (MCUs) is a challenging task, given the tight area- and power-constraints of these devices. |
