2025-04-19

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
RedMulE-FT: A Reconfigurable Fault-Tolerant Matrix Multiplication Engine	Philip Wiese, Maurus Item, Luca Bertaccini, Yvan Tortorella, Angelo Garofalo, Luca Benini	2025-04-19	Download	As safety-critical applications increasingly rely on data-parallel floating-point computations, there is an increasing need for flexible and configurable fault tolerance in parallel floating-point acc...
Accelerating LLM Inference with Flexible N:M Sparsity via A Fully Digital Compute-in-Memory Accelerator	Akshat Ramachandran, Souvik Kundu, Arnab Raha, Shamik Kundu, Deepak K. Mathaikutty, Tushar Krishna	2025-04-19	Download	Large language model (LLM) pruning with fixed N:M structured sparsity significantly limits the expressivity of the sparse model, yielding sub-optimal performance.
Improving the Serving Performance of Multi-LoRA Large Language Models via Efficient LoRA and KV Cache Management	Hang Zhang, Jiuchen Shi, Yixiao Wang, Quan Chen, Yizhou Shan, Minyi Guo	2025-04-19	Download	Multiple Low-Rank Adapters (Multi-LoRAs) are gaining popularity for task-specific Large Language Model (LLM) applications. For multi-LoRA serving, caching hot KV caches and LoRA adapters in high bandw...
FGMP: Fine-Grained Mixed-Precision Weight and Activation Quantization for Hardware-Accelerated LLM Inference	Coleman Hooper, Charbel Sakr, Ben Keller, Rangharajan Venkatesan, Kurt Keutzer, Sophia Shao, Brucek Khailany	2025-04-19	Download	Quantization is a powerful tool to improve large language model (LLM) inference efficiency by utilizing more energy-efficient low-precision datapaths and reducing memory footprint.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
A fast MPI-based Distributed Hash-Table as Surrogate Model demonstrated in a coupled reactive transport HPC simulation	Max Lübke, Marco De Lucia, Steffen Christgau, Stefan Petri, Bettina Schnor	2025-04-19	Download	Surrogate models can play a pivotal role in enhancing performance in contemporary High-Performance Computing applications. Cache-based surrogates use already calculated simulation results to interpola...
Decentralization in PoS Blockchain Consensus: Quantification and Advancement	Shashank Motepalli, Hans-Arno Jacobsen	2025-04-19	Download	Decentralization is a foundational principle of permissionless blockchains, with consensus mechanisms serving a critical role in its realization.
A parallel implementation of reduced-order modeling of large-scale systems	Ionut-Gabriel Farcas, Rayomand P. Gundevia, Ramakanth Munipalli, Karen E. Willcox	2025-04-19	Download	Motivated by the large-scale nature of modern aerospace engineering simulations, this paper presents a detailed description of distributed Operator Inference (dOpInf), a recently developed parallel al...
Advancing Polyglot Big Data Processing using the Hadoop ecosystem	Antony Seabra, Sergio Lifschitz	2025-04-19	Download	This article explores the utilization of the Hadoop ecosystem as a polyglot big data processing platform, focusing on the integration of diverse computation and storage technologies and their potentia...
Towards Polyglot Data Processing in Social Networks using the Hadoop-Spark ecosystem	Antony Seabra, Sergio Lifschitz	2025-04-19	Download	This article explores the use of the Hadoop-Spark ecosystem for social media data processing, adopting a polyglot approach with the integration of various computation and storage technologies, such as...
DIP: Efficient Large Multimodal Model Training with Dynamic Interleaved Pipeline	Zhenliang Xue, Hanpeng Hu, Xing Chen, Yimin Jiang, Yixin Song, Zeyu Mi, Yibo Zhu, Daxin Jiang, Yubin Xia, Haibo Chen	2025-04-19	Download	Large multimodal models (LMMs) have demonstrated excellent capabilities in both understanding and generation tasks with various modalities. While these models can accept flexible combinations of input...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Planet as a Brain: Towards Internet of AgentSites based on AIOS Server	Xiang Zhang, Yongfeng Zhang	2025-04-19	Download	The internet is undergoing a historical transformation from the "Internet of Websites" to the "Internet of AgentSites." While traditional Websites served as the foundation for information hosting and ...
Diffusion-based Dynamic Contract for Federated AI Agent Construction in Mobile Metaverses	Jinbo Wen, Jiawen Kang, Yang Zhang, Yue Zhong, Dusit Niyato, Jie Xu, Jianhang Tang, Chau Yuen	2025-04-19	Download	Mobile metaverses are envisioned as a transformative digital ecosystem that delivers immersive, intelligent, and ubiquitous services through mobile devices.

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Improving the Serving Performance of Multi-LoRA Large Language Models via Efficient LoRA and KV Cache Management	Hang Zhang, Jiuchen Shi, Yixiao Wang, Quan Chen, Yizhou Shan, Minyi Guo	2025-04-19	Download	Multiple Low-Rank Adapters (Multi-LoRAs) are gaining popularity for task-specific Large Language Model (LLM) applications. For multi-LoRA serving, caching hot KV caches and LoRA adapters in high bandw...