
2025-06-03

cs.AR - Architecture

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| Large Processor Chip Model | Kaiyan Chang, Mingzhi Chen, Yunji Chen, Zhirong Chen, Dongrui Fan, Junfeng Gong, Nan Guo, Yinhe Han, Qinfen Hao, Shuo Hou, Xuan Huang, Pengwei Jin, Changxin Ke, Cangyuan Li, Guangli Li, Huawei Li, Kuan Li, Naipeng Li, Shengwen Liang, Cheng Liu, Hongwei Liu, Jiahua Liu, Junliang Lv, Jianan Mu, Jin Qin, Bin Sun, Chenxi Wang, Duo Wang, Mingjun Wang, Ying Wang, Chenggang Wu, Peiyang Wu, Teng Wu, Xiao Xiao, Mengyao Xie, Chenwei Xiong, Ruiyuan Xu, Mingyu Yan, Xiaochun Ye, Kuai Yu, Rui Zhang, Shuoming Zhang, Jiacheng Zhao | 2025-06-03 | Computer System Architecture serves as a crucial bridge between software applications and the underlying hardware, encompassing components like compilers, CPUs, coprocessors, and RTL designs. |
| CLONE: Customizing LLMs for Efficient Latency-Aware Inference at the Edge | Chunlin Tian, Xinpeng Qin, Kahou Tam, Li Li, Zijian Wang, Yuanzhe Zhao, Minglei Zhang, Chengzhong Xu | 2025-06-03 | Deploying large language models (LLMs) on edge devices is crucial for delivering fast responses and ensuring data privacy. However, the limited storage, weight, and power of edge devices make it diffi... |
| CPU-Based Layout Design for Picker-to-Parts Pallet Warehouses | Timo Looms, Lin Xie | 2025-06-03 | Picker-to-parts pallet warehouses often face inefficiencies due to conventional layouts causing excessive travel distances and high labor requirements. |
| Hardware-Centric Analysis of DeepSeek's Multi-Head Latent Attention | Robin Geens, Marian Verhelst | 2025-06-03 | Multi-Head Latent Attention (MLA), introduced in DeepSeek-V2, improves the efficiency of large language models by projecting query, key, and value tensors into a compact latent space. |
| Memory Access Vectors: Improving Sampling Fidelity for CPU Performance Simulations | Sriyash Caculo, Mahesh Madhav, Jeff Baxter | 2025-06-03 | Accurate performance projection of large-scale benchmarks is essential for CPU architects to evaluate and optimize future processor designs. SimPoint sampling, which uses Basic Block Vectors (BBVs), i... |
| Minimal Neuron Circuits -- Part I: Resonators | Amr Nabil, T. Nandha Kumar, Haider Abbas F. Almurib | 2025-06-03 | Spiking Neural Networks have earned increased recognition in recent years owing to their biological plausibility and event-driven computation. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| The Cloud Next Door: Investigating the Environmental and Socioeconomic Strain of Datacenters on Local Communities | Wacuka Ngata, Noman Bashir, Michelle Westerlaken, Laurent Liote, Yasra Chandio, Elsa Olivetti | 2025-06-03 | Datacenters have become the backbone of modern digital infrastructure, powering the rapid rise of artificial intelligence and promising economic growth and technological progress. |
| Relay Selection and User Equipment Admission in Resource-Efficient NextG Sidelink Communications | Yalin E. Sagduyu, Tugba Erpek, Sastry Kompella, Kemal Davaslioglu | 2025-06-03 | 5G/6G sidelink communications addresses the challenge of connecting outer UEs, which are unable to directly access a base station (gNodeB), through inner UEs that act as relays to connect to the gNode... |
| APEX: Asynchronous Parallel CPU-GPU Execution for Online LLM Inference on Constrained GPUs | Jiakun Fan, Yanglin Zhang, Xiangchen Li, Dimitrios S. Nikolopoulos | 2025-06-03 | Deploying large language models (LLMs) for online inference is often constrained by limited GPU memory, particularly due to the growing KV cache during auto-regressive decoding. |
| GPU-Parallelizable Randomized Sketch-and-Precondition for Linear Regression using Sparse Sign Sketches | Tyler Chen, Pradeep Niroula, Archan Ray, Pragna Subrahmanya, Marco Pistoia, Niraj Kumar | 2025-06-03 | A litany of theoretical and numerical results have established the sketch-and-precondition paradigm as a powerful approach to solving large linear regression problems in standard computing environment... |
| Dynamic Fee for Reducing Impermanent Loss in Decentralized Exchanges | Irina Lebedeva, Dmitrii Umnov, Yury Yanovich, Ignat Melnikov, George Ovchinnikov | 2025-06-03 | Decentralized exchanges (DEXs) are crucial to decentralized finance (DeFi) as they enable trading without intermediaries. However, they face challenges like impermanent loss (IL), where liquidity prov... |
| Memory-Efficient Split Federated Learning for LLM Fine-Tuning on Heterogeneous Mobile Devices | Xiaopei Chen, Liang Li, Fei Ji, Wen Wu | 2025-06-03 | In this paper, we propose an edge-assisted split federated learning framework to facilitate large language model (LLM) fine-tuning on heterogeneous mobile devices while alleviating memory pressures on... |
| Overcoming Challenges of Partial Client Participation in Federated Learning: A Comprehensive Review | Mrinmay Sen, Shruti Aparna, Rohit Agarwal, Chalavadi Krishna Mohan | 2025-06-03 | Federated Learning (FL) is a learning mechanism that falls under the distributed training umbrella, which collaboratively trains a shared global model without disclosing the raw data from different cl... |
| Process Mining on Distributed Data Sources | Maximilian Weisenseel, Julia Andersen, Samira Akili, Christian Imenkamp, Hendrik Reiter, Christoffer Rubensson, Wilhelm Hasselbring, Olaf Landsiedel, Xixi Lu, Jan Mendling, Florian Tschorsch, Matthias Weidlich, Agnes Koschmider | 2025-06-03 | Major domains such as logistics, healthcare, and smart cities increasingly rely on sensor technologies and distributed infrastructures to monitor complex processes in real time. |
| From Local Updates to Global Balance: A Framework for Distributed Matrix Scaling | Giacomo Aletti, Giovanni Naldi | 2025-06-03 | This paper investigates matrix scaling processes in the context of local normalization algorithms and their convergence behavior. Starting from the classical Sinkhorn algorithm, the authors introduce ... |
| Adaptive Configuration Selection for Multi-Model Inference Pipelines in Edge Computing | Jinhao Sheng, Zhiqing Tang, Jianxiong Guo, Tian Wang | 2025-06-03 | The growing demand for real-time processing tasks is driving the need for multi-model inference pipelines on edge devices. However, cost-effectively deploying these pipelines while optimizing Quality ... |
| Exploring metrics for analyzing dynamic behavior in MPI programs via a coupled-oscillator model | Ayesha Afzal, Georg Hager, Gerhard Wellein | 2025-06-03 | We propose a novel, lightweight, and physically inspired approach to modeling the dynamics of parallel distributed-memory programs. Inspired by the Kuramoto model, we represent MPI processes as couple... |
| Rethinking Dynamic Networks and Heterogeneous Computing with Automatic Parallelization | Ruilong Wu, Xinjiao Li, Yisu Wang, Xinyu Chen, Dirk Kutscher | 2025-06-03 | Hybrid parallelism techniques are essential for efficiently training large language models (LLMs). Nevertheless, current automatic parallel planning frameworks often overlook the simultaneous consider... |
| Usability Evaluation of Cloud for HPC Applications | Vanessa Sochat, Daniel Milroy, Abhik Sarkar, Aniruddha Marathe, Tapasya Patki | 2025-06-03 | The rise of AI and the economic dominance of cloud computing have created a new nexus of innovation for high performance computing (HPC), which has a long history of driving scientific discovery. |
| KVCache Cache in the Wild: Characterizing and Optimizing KVCache Cache at a Large Cloud Provider | Jiahao Wang, Jinbo Han, Xingda Wei, Sijie Shen, Dingyan Zhang, Chenguang Fang, Rong Chen, Wenyuan Yu, Haibo Chen | 2025-06-03 | Serving large language models (LLMs) is important for cloud providers, and caching intermediate results (KV$) after processing each request substantially improves serving throughput and latency. |
| Distributedness based scheduling | Paritosh Ranjan, Surajit Majumder, Prodip Roy, Bhuban Padhan | 2025-06-03 | Efficient utilization of computing resources in a Kubernetes cluster is often constrained by the uneven distribution of pods with similar usage patterns. |
| Simplifying Root Cause Analysis in Kubernetes with StateGraph and LLM | Yong Xiang, Charley Peter Chen, Liyi Zeng, Wei Yin, Xin Liu, Hu Li, Wei Xu | 2025-06-03 | Kubernetes, a notably complex and distributed system, utilizes an array of controllers to uphold cluster management logic through state reconciliation. |
| DiOMP-Offloading: Toward Portable Distributed Heterogeneous OpenMP | Baodi Shan, Mauricio Araya-Polo, Barbara Chapman | 2025-06-03 | As core counts and heterogeneity rise in HPC, traditional hybrid programming models face challenges in managing distributed GPU memory and ensuring portability. |
| Enhancing Convergence, Privacy and Fairness for Wireless Personalized Federated Learning: Quantization-Assisted Min-Max Fair Scheduling | Xiyu Zhao, Qimei Cui, Ziqiang Du, Weicai Li, Xi Yu, Wei Ni, Ji Zhang, Xiaofeng Tao, Ping Zhang | 2025-06-03 | Personalized federated learning (PFL) offers a solution to balancing personalization and generalization by conducting federated learning (FL) to guide personalized learning (PL). |
| Converge Faster, Talk Less: Hessian-Informed Federated Zeroth-Order Optimization | Zhe Li, Bicheng Ying, Zidong Liu, Chaosheng Dong, Haibo Yang | 2025-06-03 | Zeroth-order (ZO) optimization enables dimension-free communication in federated learning (FL), making it attractive for fine-tuning of large language models (LLMs) due to significant communication sa... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| Relay Selection and User Equipment Admission in Resource-Efficient NextG Sidelink Communications | Yalin E. Sagduyu, Tugba Erpek, Sastry Kompella, Kemal Davaslioglu | 2025-06-03 | 5G/6G sidelink communications addresses the challenge of connecting outer UEs, which are unable to directly access a base station (gNodeB), through inner UEs that act as relays to connect to the gNode... |
| AI-Augmented OTDR Fault Localization Framework for Resilient Rural Fiber Networks in the United States | Sabab Al Farabi | 2025-06-03 | This research presents a novel framework that combines traditional Optical Time-Domain Reflectometer (OTDR) signal analysis with machine learning to localize and classify fiber optic faults in rural b... |
| Computation- and Communication-Efficient Online FL for Resource-Constrained Aerial Vehicles | Ferdous Pervej, Richeng Jin, Md Moin Uddin Chowdhury, Simran Singh, İsmail Güvenç, Huaiyu Dai | 2025-06-03 | Privacy-preserving distributed machine learning (ML) and aerial connected vehicle (ACV)-assisted edge computing have drawn significant attention lately. |
| Quantum Data Centres: Why Entanglement Changes Everything | Angela Sara Cacciapuoti, Claudio Pellitteri, Jessica Illiano, Laura d'Avossa, Francesco Mazza, Siyi Chen, Marcello Caleffi | 2025-06-03 | The Quantum Internet is key for distributed quantum computing, by interconnecting multiple quantum processors into a virtual quantum computation system. |
| NetArena: Dynamic Benchmarks for AI Agents in Network Automation | Yajie Zhou, Jiajun Ruan, Eric S. Wang, Sadjad Fouladi, Francis Y. Yan, Kevin Hsieh, Zaoxing Liu | 2025-06-03 | As AI agents expand into high-stakes domains like network system operations, evaluating their real-world reliability becomes increasingly critical. |
| AI-Driven Vehicle Condition Monitoring with Cell-Aware Edge Service Migration | Charalampos Kalalas, Pavol Mulinka, Guillermo Candela Belmonte, Miguel Fornell, Michail Dalgitsis, Francisco Paredes Vera, Javier Santaella Sánchez, Carmen Vicente Villares, Roshan Sedar, Eftychia Datsika, Angelos Antonopoulos, Antonio Fernández Ojea, Miquel Payaro | 2025-06-03 | Artificial intelligence (AI) has been increasingly applied to the condition monitoring of vehicular equipment, aiming to enhance maintenance strategies, reduce costs, and improve safety. |
| HARNode: A Time-Synchronised, Open-Source, Multi-Device, Wearable System for Ad Hoc Field Studies | Philipp Lepold, Tobias Röddiger, Michael Beigl | 2025-06-03 | Human activity recognition (HAR) research often lacks accessible, comprehensive field data. Commercial systems are rarely open source, hard to expand, and limited by issues like node synchronisation, ... |
| Zero-Energy RIS-Assisted Communications With Noise Modulation and Interference-Based Energy Harvesting | Ahmad Massud Tota Khel, Aissa Ikhlef, Zhiguo Ding, Hongjian Sun | 2025-06-03 | To advance towards carbon-neutrality and improve the limited performance of conventional passive wireless communications, in this paper, we investigate the integration of noise modulation with zero-... |
| Channel-adaptive Cross-modal Generative Semantic Communication for Point Cloud Transmission | Wanting Yang, Zehui Xiong, Qianqian Yang, Ping Zhang, Merouane Debbah, Rahim Tafazolli | 2025-06-03 | With the rapid development of autonomous driving and extended reality, efficient transmission of point clouds (PCs) has become increasingly important. |

cs.PF - Performance

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| Energy Efficiency Analysis of Active RIS-enhanced Wireless Network under Power-Sum Constraint | Jingdie Xin, Yan Wang, Feng Shu, Feng Zhao, Yifan Zhao, Hao Jiang | 2025-06-03 | Recently, as a green wireless technology, active reconfigurable intelligent surface (RIS) attracts numerous research activities due to its amplifying ability to combat the double-fading effect compare... |
| Usability Evaluation of Cloud for HPC Applications | Vanessa Sochat, Daniel Milroy, Abhik Sarkar, Aniruddha Marathe, Tapasya Patki | 2025-06-03 | The rise of AI and the economic dominance of cloud computing have created a new nexus of innovation for high performance computing (HPC), which has a long history of driving scientific discovery. |
| Spatially Correlated multi-RIS Communication: The Effect of Inter-Operator Interference | Nikolaos I. Miridakis, Panagiotis A. Karkazis | 2025-06-03 | A multi-operator wireless communication system is studied where each operator is equipped with a reconfigurable intelligent surface (RIS) to enhance its communication quality. |
