
2025-05-28

cs.AR - Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| CrossNAS: A Cross-Layer Neural Architecture Search Framework for PIM Systems | Md Hasibul Amin, Mohammadreza Mohammadi, Jason D. Bakos, Ramtin Zand | 2025-05-28 | Download | In this paper, we propose the CrossNAS framework, an automated approach for exploring a vast, multidimensional search space that spans various design abstraction layers-circuits, architecture, and sys... |
| GPU-Accelerated Simulated Oscillator Ising/Potts Machine Solving Combinatorial Optimization Problems | Yilmaz Ege Gonul, Ceyhun Efe Kayan, Ilknur Mustafazade, Nagarajan Kandasamy, Baris Taskin | 2025-05-28 | Download | Oscillator-based Ising machines (OIMs) and oscillator-based Potts machines (OPMs) have emerged as promising hardware accelerators for solving NP-hard combinatorial optimization problems by leveraging ... |
| Efficient Precision-Scalable Hardware for Microscaling (MX) Processing in Robotics Learning | Stef Cuyckens, Xiaoling Yi, Nitish Satya Murthy, Chao Fang, Marian Verhelst | 2025-05-28 | Download | Autonomous robots require efficient on-device learning to adapt to new environments without cloud dependency. For this edge training, Microscaling (MX) data types offer a promising solution by combini... |
| EStacker: Explaining Battery-Less IoT System Performance with Energy Stacks | Lukas Liedtke, Per Gunnar Kjeldsberg, Frank Alexander Kraemer, Magnus Jahre | 2025-05-28 | Download | The number of Internet of Things (IoT) devices is increasing exponentially, and it is environmentally and economically unsustainable to power all these devices with batteries. |
| Refining Datapath for Microscaling ViTs | Can Xiao, Jianyi Cheng, Aaron Zhao | 2025-05-28 | Download | Vision Transformers (ViTs) leverage the transformer architecture to effectively capture global context, demonstrating strong performance in computer vision tasks. |
| DeepRTL2: A Versatile Model for RTL-Related Tasks | Yi Liu, Hongji Zhang, Yunhao Zhou, Zhengyuan Shi, Changran Xu, Qiang Xu | 2025-05-28 | Download | The integration of large language models (LLMs) into electronic design automation (EDA) has significantly advanced the field, offering transformative benefits, particularly in register transfer level ... |
| iDSE: Navigating Design Space Exploration in High-Level Synthesis Using LLMs | Runkai Li, Jia Xiong, Xi Wang | 2025-05-28 | Download | High-Level Synthesis (HLS) serves as an agile hardware development tool that streamlines the circuit design by abstracting the register transfer level into behavioral descriptions, while allowing desi... |
| FALCON: An ML Framework for Fully Automated Layout-Constrained Analog Circuit Design | Asal Mehradfar, Xuzhe Zhao, Yilun Huang, Emir Ceyani, Yankai Yang, Shihao Han, Hamidreza Aghasi, Salman Avestimehr | 2025-05-28 | Download | Designing analog circuits from performance specifications is a complex, multi-stage process encompassing topology selection, parameter inference, and layout feasibility. |
| Linear Layouts: Robust Code Generation of Efficient Tensor Computation Using $\mathbb{F}_2$ | Keren Zhou, Mario Lezcano, Adam Goucher, Akhmed Rakhmati, Jeff Niu, Justin Lebar, Pawel Szczerbuk, Peter Bell, Phil Tillet, Thomas Raoux, Zahi Moudallal | 2025-05-28 | Download | Efficient tensor computation is a cornerstone of modern deep learning (DL) workloads, yet existing approaches struggle to achieve flexible and performant design and implementation of tensor layouts --... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| DistMLIP: A Distributed Inference Platform for Machine Learning Interatomic Potentials | Kevin Han, Bowen Deng, Amir Barati Farimani, Gerbrand Ceder | 2025-05-28 | Download | Large-scale atomistic simulations are essential to bridge computational materials and chemistry to realistic materials and drug discovery applications. |
| Profiling and optimization of multi-card GPU machine learning jobs | Marcin Lawenda, Kyrylo Khloponin, Krzesimir Samborski, Łukasz Szustak | 2025-05-28 | Download | The effectiveness and efficiency of machine learning methodologies are crucial, especially with respect to the quality of results and computational cost. |
| Visualizing Cloud-native Applications with KubeDiagrams | Philippe Merle, Fabio Petrillo | 2025-05-28 | Download | Modern distributed applications increasingly rely on cloud-native platforms to abstract the complexity of deployment and scalability. As the de facto orchestration standard, Kubernetes enables this ab... |
| The National Research Platform: Stretched, Multi-Tenant, Scientific Kubernetes Cluster | Derek Weitzel, Ashton Graves, Sam Albin, Huijun Zhu, Frank Würthwein, Mahidhar Tatineni, Dmitry Mishin, John Graham, Elham E Khoda, Mohammad Firas Sada, Larry Smarr, Thomas DeFanti | 2025-05-28 | Download | The National Research Platform (NRP) represents a distributed, multi-tenant Kubernetes-based cyberinfrastructure designed to facilitate collaborative scientific computing. |
| Smart Contracts for SMEs and Large Companies | C. G. Liu, P. Bodorik, D. Jutla | 2025-05-28 | Download | Research on blockchains addresses multiple issues, with one being writing smart contracts. In our previous research we described methodology and a tool to generate, in automated fashion, smart contrac... |
| Energy Profiling of Data-Sharing Pipelines: Modeling, Estimation, and Reuse Strategies | Sepideh Masoudi, Sebastian Werner, Pierluigi Plebani, Stefan Tai | 2025-05-28 | Download | Data-sharing pipelines involve a series of stages that apply policy-based data transformations to enable secure and effective data exchange among organizations. |
| Inclusive, Differentially Private Federated Learning for Clinical Data | Santhosh Parampottupadam, Melih Coşğun, Sarthak Pati, Maximilian Zenk, Saikat Roy, Dimitrios Bounias, Benjamin Hamm, Sinem Sav, Ralf Floca, Klaus Maier-Hein | 2025-05-28 | Download | Federated Learning (FL) offers a promising approach for training clinical AI models without centralizing sensitive patient data. However, its real-world adoption is hindered by challenges related to p... |
| Towards Efficient Key-Value Cache Management for Prefix Prefilling in LLM Inference | Yue Zhu, Hao Yu, Chen Wang, Zhuoran Liu, Eun Kyung Lee | 2025-05-28 | Download | The increasing adoption of large language models (LLMs) with extended context windows necessitates efficient Key-Value Cache (KVC) management to optimize inference performance. |
| Jointλ: Orchestrating Serverless Workflows on Jointcloud FaaS Systems | Rui Li, Jianfei Liu, Zhilin Yang, Peichang Shi, Guodong Yi, Huaimin Wang | 2025-05-28 | Download | Existing serverless workflow orchestration systems are predominantly designed for a single-cloud FaaS system, leading to vendor lock-in. This restricts performance optimization, cost reduction, and av... |
| Hybrid Batch Normalisation: Resolving the Dilemma of Batch Normalisation in Federated Learning | Hongyao Chen, Tianyang Xu, Xiaojun Wu, Josef Kittler | 2025-05-28 | Download | Batch Normalisation (BN) is widely used in conventional deep neural network training to harmonise the input-output distributions for each batch of data. |
| Linear Layouts: Robust Code Generation of Efficient Tensor Computation Using $\mathbb{F}_2$ | Keren Zhou, Mario Lezcano, Adam Goucher, Akhmed Rakhmati, Jeff Niu, Justin Lebar, Pawel Szczerbuk, Peter Bell, Phil Tillet, Thomas Raoux, Zahi Moudallal | 2025-05-28 | Download | Efficient tensor computation is a cornerstone of modern deep learning (DL) workloads, yet existing approaches struggle to achieve flexible and performant design and implementation of tensor layouts --... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Frequency Resource Management in 6G User-Centric CFmMIMO: A Hybrid Reinforcement Learning and Metaheuristic Approach | Selina Cheggour, Valeria Loscri | 2025-05-28 | Download | As sixth-generation (6G) networks continue to evolve, AI-driven solutions are playing a crucial role in enabling more efficient and adaptive resource management in wireless communication. |
| Hybrid Learning for Cold-Start-Aware Microservice Scheduling in Dynamic Edge Environments | Jingxi Lu, Wenhao Li, Jianxiong Guo, Xingjian Ding, Zhiqing Tang, Tian Wang, Weijia Jia | 2025-05-28 | Download | With the rapid growth of IoT devices and their diverse workloads, container-based microservices deployed at edge nodes have become a lightweight and scalable solution. |
| Chain-of-Thought for Large Language Model-empowered Wireless Communications | Xudong Wang, Jian Zhu, Ruichen Zhang, Lei Feng, Dusit Niyato, Jiacheng Wang, Hongyang Du, Shiwen Mao, Zhu Han | 2025-05-28 | Download | Recent advances in large language models (LLMs) have opened new possibilities for automated reasoning and decision-making in wireless networks. |
| From Large AI Models to Agentic AI: A Tutorial on Future Intelligent Communications | Feibo Jiang, Cunhua Pan, Li Dong, Kezhi Wang, Octavia A. Dobre, Merouane Debbah | 2025-05-28 | Download | With the advent of 6G communications, intelligent communication systems face multiple challenges, including constrained perception and response capabilities, limited scalability, and low adaptability ... |
| Domainator: Detecting and Identifying DNS-Tunneling Malware Using Metadata Sequences | Denis Petrov, Pascal Ruffing, Sebastian Zillien, Steffen Wendzel | 2025-05-28 | Download | In recent years, malware with tunneling (or: covert channel) capabilities is on the rise. While malware research led to several methods and innovations, the detection and differentiation of malware so... |
| Real-World Modeling of Computation Offloading for Neural Networks with Early Exits and Splits | Jan Danek, Zdenek Becvar, Adam Janes | 2025-05-28 | Download | We focus on computation offloading of applications based on convolutional neural network (CNN) from moving devices, such as mobile robots or autonomous vehicles, to MultiAccess Edge Computing (MEC) se... |
| Streaming Remote rendering services: Comparison of QUIC-based and WebRTC Protocols | Daniel Mejías, Inhar Yeregui, Ángel Martín, Roberto Viola, Pablo Angueira, Jon Montalbán | 2025-05-28 | Download | The proliferation of Extended Reality (XR) applications, requiring high-quality, low-latency media streaming, has driven the demand for efficient remote rendering solutions. |
| Leveraging 5G Physical Layer Monitoring for Adaptive Remote Rendering in XR Applications | Inhar Yeregui, Daniel Mejías, Mikel Zorrilla, Roberto Viola, Jasone Astorga, Eduardo Jacob | 2025-05-28 | Download | As immersive eXtended Reality (XR) applications demand substantial network resources, understanding their interaction with 5G networks becomes crucial to improve them. |
| The Tri-Hybrid MIMO Architecture | Robert W. Heath, Joseph Carlson, Nitish Vikas Deshpande, Miguel Rodrigo Castellanos, Mohamed Akrout, Chan-Byoung Chae | 2025-05-28 | Download | We present an evolution of multiple-input multiple-output (MIMO) wireless communications known as the tri-hybrid MIMO architecture. In this framework, the traditional operations of linear precoding at... |
| Distributionally Robust Wireless Semantic Communication with Large AI Models | Long Tan Le, Senura Hansaja Wanasekara, Zerun Niu, Nguyen H. Tran, Phuong Vo, Walid Saad, Dusit Niyato, Zhu Han, Choong Seon Hong, H. Vincent Poor | 2025-05-28 | Download | Semantic communication (SemCom) has emerged as a promising paradigm for 6G wireless systems by transmitting task-relevant information rather than raw bits, yet existing approaches remain vulnerable to... |

cs.PF - Performance

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| DistMLIP: A Distributed Inference Platform for Machine Learning Interatomic Potentials | Kevin Han, Bowen Deng, Amir Barati Farimani, Gerbrand Ceder | 2025-05-28 | Download | Large-scale atomistic simulations are essential to bridge computational materials and chemistry to realistic materials and drug discovery applications. |
| Profiling and optimization of multi-card GPU machine learning jobs | Marcin Lawenda, Kyrylo Khloponin, Krzesimir Samborski, Łukasz Szustak | 2025-05-28 | Download | The effectiveness and efficiency of machine learning methodologies are crucial, especially with respect to the quality of results and computational cost. |
| CPINN-ABPI: Physics-Informed Neural Networks for Accurate Power Estimation in MPSoCs | Mohamed R. Elshamy, Mehdi Elahi, Ahmad Patooghy, Abdel-Hameed A. Badawy | 2025-05-28 | Download | Efficient thermal and power management in modern multiprocessor systems-on-chip (MPSoCs) demands accurate power consumption estimation. One of the state-of-the-art approaches, Alternative Blind Power ... |
| Linear Layouts: Robust Code Generation of Efficient Tensor Computation Using $\mathbb{F}_2$ | Keren Zhou, Mario Lezcano, Adam Goucher, Akhmed Rakhmati, Jeff Niu, Justin Lebar, Pawel Szczerbuk, Peter Bell, Phil Tillet, Thomas Raoux, Zahi Moudallal | 2025-05-28 | Download | Efficient tensor computation is a cornerstone of modern deep learning (DL) workloads, yet existing approaches struggle to achieve flexible and performant design and implementation of tensor layouts --... |
