2025-05-28

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
CrossNAS: A Cross-Layer Neural Architecture Search Framework for PIM Systems	Md Hasibul Amin, Mohammadreza Mohammadi, Jason D. Bakos, Ramtin Zand	2025-05-28	Download	In this paper, we propose the CrossNAS framework, an automated approach for exploring a vast, multidimensional search space that spans various design abstraction layers-circuits, architecture, and sys...
GPU-Accelerated Simulated Oscillator Ising/Potts Machine Solving Combinatorial Optimization Problems	Yilmaz Ege Gonul, Ceyhun Efe Kayan, Ilknur Mustafazade, Nagarajan Kandasamy, Baris Taskin	2025-05-28	Download	Oscillator-based Ising machines (OIMs) and oscillator-based Potts machines (OPMs) have emerged as promising hardware accelerators for solving NP-hard combinatorial optimization problems by leveraging ...
Efficient Precision-Scalable Hardware for Microscaling (MX) Processing in Robotics Learning	Stef Cuyckens, Xiaoling Yi, Nitish Satya Murthy, Chao Fang, Marian Verhelst	2025-05-28	Download	Autonomous robots require efficient on-device learning to adapt to new environments without cloud dependency. For this edge training, Microscaling (MX) data types offer a promising solution by combini...
EStacker: Explaining Battery-Less IoT System Performance with Energy Stacks	Lukas Liedtke, Per Gunnar Kjeldsberg, Frank Alexander Kraemer, Magnus Jahre	2025-05-28	Download	The number of Internet of Things (IoT) devices is increasing exponentially, and it is environmentally and economically unsustainable to power all these devices with batteries.
Refining Datapath for Microscaling ViTs	Can Xiao, Jianyi Cheng, Aaron Zhao	2025-05-28	Download	Vision Transformers (ViTs) leverage the transformer architecture to effectively capture global context, demonstrating strong performance in computer vision tasks.
DeepRTL2: A Versatile Model for RTL-Related Tasks	Yi Liu, Hongji Zhang, Yunhao Zhou, Zhengyuan Shi, Changran Xu, Qiang Xu	2025-05-28	Download	The integration of large language models (LLMs) into electronic design automation (EDA) has significantly advanced the field, offering transformative benefits, particularly in register transfer level ...
iDSE: Navigating Design Space Exploration in High-Level Synthesis Using LLMs	Runkai Li, Jia Xiong, Xi Wang	2025-05-28	Download	High-Level Synthesis (HLS) serves as an agile hardware development tool that streamlines the circuit design by abstracting the register transfer level into behavioral descriptions, while allowing desi...
FALCON: An ML Framework for Fully Automated Layout-Constrained Analog Circuit Design	Asal Mehradfar, Xuzhe Zhao, Yilun Huang, Emir Ceyani, Yankai Yang, Shihao Han, Hamidreza Aghasi, Salman Avestimehr	2025-05-28	Download	Designing analog circuits from performance specifications is a complex, multi-stage process encompassing topology selection, parameter inference, and layout feasibility.
Linear Layouts: Robust Code Generation of Efficient Tensor Computation Using $\mathbb{F}_2$	Keren Zhou, Mario Lezcano, Adam Goucher, Akhmed Rakhmati, Jeff Niu, Justin Lebar, Pawel Szczerbuk, Peter Bell, Phil Tillet, Thomas Raoux, Zahi Moudallal	2025-05-28	Download	Efficient tensor computation is a cornerstone of modern deep learning (DL) workloads, yet existing approaches struggle to achieve flexible and performant design and implementation of tensor layouts --...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
DistMLIP: A Distributed Inference Platform for Machine Learning Interatomic Potentials	Kevin Han, Bowen Deng, Amir Barati Farimani, Gerbrand Ceder	2025-05-28	Download	Large-scale atomistic simulations are essential to bridge computational materials and chemistry to realistic materials and drug discovery applications.
Profiling and optimization of multi-card GPU machine learning jobs	Marcin Lawenda, Kyrylo Khloponin, Krzesimir Samborski, Łukasz Szustak	2025-05-28	Download	The effectiveness and efficiency of machine learning methodologies are crucial, especially with respect to the quality of results and computational cost.
Visualizing Cloud-native Applications with KubeDiagrams	Philippe Merle, Fabio Petrillo	2025-05-28	Download	Modern distributed applications increasingly rely on cloud-native platforms to abstract the complexity of deployment and scalability. As the de facto orchestration standard, Kubernetes enables this ab...
The National Research Platform: Stretched, Multi-Tenant, Scientific Kubernetes Cluster	Derek Weitzel, Ashton Graves, Sam Albin, Huijun Zhu, Frank Würthwein, Mahidhar Tatineni, Dmitry Mishin, John Graham, Elham E Khoda, Mohammad Firas Sada, Larry Smarr, Thomas DeFanti	2025-05-28	Download	The National Research Platform (NRP) represents a distributed, multi-tenant Kubernetes-based cyberinfrastructure designed to facilitate collaborative scientific computing.
Smart Contracts for SMEs and Large Companies	C. G. Liu, P. Bodorik, D. Jutla	2025-05-28	Download	Research on blockchains addresses multiple issues, with one being writing smart contracts. In our previous research we described methodology and a tool to generate, in automated fashion, smart contrac...
Energy Profiling of Data-Sharing Pipelines: Modeling, Estimation, and Reuse Strategies	Sepideh Masoudi, Sebastian Werner, Pierluigi Plebani, Stefan Tai	2025-05-28	Download	Data-sharing pipelines involve a series of stages that apply policy-based data transformations to enable secure and effective data exchange among organizations.
Inclusive, Differentially Private Federated Learning for Clinical Data	Santhosh Parampottupadam, Melih Coşğun, Sarthak Pati, Maximilian Zenk, Saikat Roy, Dimitrios Bounias, Benjamin Hamm, Sinem Sav, Ralf Floca, Klaus Maier-Hein	2025-05-28	Download	Federated Learning (FL) offers a promising approach for training clinical AI models without centralizing sensitive patient data. However, its real-world adoption is hindered by challenges related to p...
Towards Efficient Key-Value Cache Management for Prefix Prefilling in LLM Inference	Yue Zhu, Hao Yu, Chen Wang, Zhuoran Liu, Eun Kyung Lee	2025-05-28	Download	The increasing adoption of large language models (LLMs) with extended context windows necessitates efficient Key-Value Cache (KVC) management to optimize inference performance.
Jointλ: Orchestrating Serverless Workflows on Jointcloud FaaS Systems	Rui Li, Jianfei Liu, Zhilin Yang, Peichang Shi, Guodong Yi, Huaimin Wang	2025-05-28	Download	Existing serverless workflow orchestration systems are predominantly designed for a single-cloud FaaS system, leading to vendor lock-in. This restricts performance optimization, cost reduction, and av...
Hybrid Batch Normalisation: Resolving the Dilemma of Batch Normalisation in Federated Learning	Hongyao Chen, Tianyang Xu, Xiaojun Wu, Josef Kittler	2025-05-28	Download	Batch Normalisation (BN) is widely used in conventional deep neural network training to harmonise the input-output distributions for each batch of data.
Linear Layouts: Robust Code Generation of Efficient Tensor Computation Using $\mathbb{F}_2$	Keren Zhou, Mario Lezcano, Adam Goucher, Akhmed Rakhmati, Jeff Niu, Justin Lebar, Pawel Szczerbuk, Peter Bell, Phil Tillet, Thomas Raoux, Zahi Moudallal	2025-05-28	Download	Efficient tensor computation is a cornerstone of modern deep learning (DL) workloads, yet existing approaches struggle to achieve flexible and performant design and implementation of tensor layouts --...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Frequency Resource Management in 6G User-Centric CFmMIMO: A Hybrid Reinforcement Learning and Metaheuristic Approach	Selina Cheggour, Valeria Loscri	2025-05-28	Download	As sixth-generation (6G) networks continue to evolve, AI-driven solutions are playing a crucial role in enabling more efficient and adaptive resource management in wireless communication.
Hybrid Learning for Cold-Start-Aware Microservice Scheduling in Dynamic Edge Environments	Jingxi Lu, Wenhao Li, Jianxiong Guo, Xingjian Ding, Zhiqing Tang, Tian Wang, Weijia Jia	2025-05-28	Download	With the rapid growth of IoT devices and their diverse workloads, container-based microservices deployed at edge nodes have become a lightweight and scalable solution.
Chain-of-Thought for Large Language Model-empowered Wireless Communications	Xudong Wang, Jian Zhu, Ruichen Zhang, Lei Feng, Dusit Niyato, Jiacheng Wang, Hongyang Du, Shiwen Mao, Zhu Han	2025-05-28	Download	Recent advances in large language models (LLMs) have opened new possibilities for automated reasoning and decision-making in wireless networks.
From Large AI Models to Agentic AI: A Tutorial on Future Intelligent Communications	Feibo Jiang, Cunhua Pan, Li Dong, Kezhi Wang, Octavia A. Dobre, Merouane Debbah	2025-05-28	Download	With the advent of 6G communications, intelligent communication systems face multiple challenges, including constrained perception and response capabilities, limited scalability, and low adaptability ...
Domainator: Detecting and Identifying DNS-Tunneling Malware Using Metadata Sequences	Denis Petrov, Pascal Ruffing, Sebastian Zillien, Steffen Wendzel	2025-05-28	Download	In recent years, malware with tunneling (or: covert channel) capabilities is on the rise. While malware research led to several methods and innovations, the detection and differentiation of malware so...
Real-World Modeling of Computation Offloading for Neural Networks with Early Exits and Splits	Jan Danek, Zdenek Becvar, Adam Janes	2025-05-28	Download	We focus on computation offloading of applications based on convolutional neural network (CNN) from moving devices, such as mobile robots or autonomous vehicles, to MultiAccess Edge Computing (MEC) se...
Streaming Remote rendering services: Comparison of QUIC-based and WebRTC Protocols	Daniel Mejías, Inhar Yeregui, Ángel Martín, Roberto Viola, Pablo Angueira, Jon Montalbán	2025-05-28	Download	The proliferation of Extended Reality (XR) applications, requiring high-quality, low-latency media streaming, has driven the demand for efficient remote rendering solutions.
Leveraging 5G Physical Layer Monitoring for Adaptive Remote Rendering in XR Applications	Inhar Yeregui, Daniel Mejías, Mikel Zorrilla, Roberto Viola, Jasone Astorga, Eduardo Jacob	2025-05-28	Download	As immersive eXtended Reality (XR) applications demand substantial network resources, understanding their interaction with 5G networks becomes crucial to improve them.
The Tri-Hybrid MIMO Architecture	Robert W. Heath, Joseph Carlson, Nitish Vikas Deshpande, Miguel Rodrigo Castellanos, Mohamed Akrout, Chan-Byoung Chae	2025-05-28	Download	We present an evolution of multiple-input multiple-output (MIMO) wireless communications known as the tri-hybrid MIMO architecture. In this framework, the traditional operations of linear precoding at...
Distributionally Robust Wireless Semantic Communication with Large AI Models	Long Tan Le, Senura Hansaja Wanasekara, Zerun Niu, Nguyen H. Tran, Phuong Vo, Walid Saad, Dusit Niyato, Zhu Han, Choong Seon Hong, H. Vincent Poor	2025-05-28	Download	Semantic communication (SemCom) has emerged as a promising paradigm for 6G wireless systems by transmitting task-relevant information rather than raw bits, yet existing approaches remain vulnerable to...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
DistMLIP: A Distributed Inference Platform for Machine Learning Interatomic Potentials	Kevin Han, Bowen Deng, Amir Barati Farimani, Gerbrand Ceder	2025-05-28	Download	Large-scale atomistic simulations are essential to bridge computational materials and chemistry to realistic materials and drug discovery applications.
Profiling and optimization of multi-card GPU machine learning jobs	Marcin Lawenda, Kyrylo Khloponin, Krzesimir Samborski, Łukasz Szustak	2025-05-28	Download	The effectiveness and efficiency of machine learning methodologies are crucial, especially with respect to the quality of results and computational cost.
CPINN-ABPI: Physics-Informed Neural Networks for Accurate Power Estimation in MPSoCs	Mohamed R. Elshamy, Mehdi Elahi, Ahmad Patooghy, Abdel-Hameed A. Badawy	2025-05-28	Download	Efficient thermal and power management in modern multiprocessor systems-on-chip (MPSoCs) demands accurate power consumption estimation. One of the state-of-the-art approaches, Alternative Blind Power ...
Linear Layouts: Robust Code Generation of Efficient Tensor Computation Using $\mathbb{F}_2$	Keren Zhou, Mario Lezcano, Adam Goucher, Akhmed Rakhmati, Jeff Niu, Justin Lebar, Pawel Szczerbuk, Peter Bell, Phil Tillet, Thomas Raoux, Zahi Moudallal	2025-05-28	Download	Efficient tensor computation is a cornerstone of modern deep learning (DL) workloads, yet existing approaches struggle to achieve flexible and performant design and implementation of tensor layouts --...