2025-03-17

cs.AR - Hardware Architecture

Title	Authors	Published	PDF	Abstract
LIMCA: LLM for Automating Analog In-Memory Computing Architecture Design Exploration	Deepak Vungarala, Md Hasibul Amin, Pietro Mercati, Arnob Ghosh, Arman Roohi, Ramtin Zand, Shaahin Angizi	2025-03-17	Download	Resistive crossbars enabling analog In-Memory Computing (IMC) have emerged as a promising architecture for Deep Neural Network (DNN) acceleration, offering high memory bandwidth and in-situ computatio...
VeriLeaky: Navigating IP Protection vs Utility in Fine-Tuning for LLM-Driven Verilog Coding	Zeng Wang, Minghao Shao, Mohammed Nabeel, Prithwish Basu Roy, Likhitha Mankali, Jitendra Bhandari, Ramesh Karri, Ozgur Sinanoglu, Muhammad Shafique, Johann Knechtel	2025-03-17	Download	Large language models (LLMs) offer significant potential for coding, yet fine-tuning (FT) with curated data is essential for niche languages like Verilog.
VeriContaminated: Assessing LLM-Driven Verilog Coding for Data Contamination	Zeng Wang, Minghao Shao, Jitendra Bhandari, Likhitha Mankali, Ramesh Karri, Ozgur Sinanoglu, Muhammad Shafique, Johann Knechtel	2025-03-17	Download	Large Language Models (LLMs) have revolutionized code generation, achieving exceptional results on various established benchmarking frameworks.
Managing Hybrid Solid-State Drives Using Large Language Models	Qian Wei, Yi Li, Zehao Chen, Zhaoyan Shen, Dongxiao Yu, Bingzhe Li	2025-03-17	Download	Hybrid Solid-State Drives (SSDs), which integrate several types of flash cells (e.g., single-level cell (SLC) and multiple-level cell (MLC)) in a single drive and enable them to convert between each o...
HERMES: High-Performance RISC-V Memory Hierarchy for ML Workloads	Pranav Suryadevara	2025-03-17	Download	The growth of machine learning (ML) workloads has underscored the importance of efficient memory hierarchies to address bandwidth, latency, and scalability challenges.
ROMA: a Read-Only-Memory-based Accelerator for QLoRA-based On-Device LLM	Wenqiang Wang, Yijia Zhang, Zikai Zhang, Guanting Huo, Hao Liang, Shijie Cao, Ningyi Xu	2025-03-17	Download	As large language models (LLMs) demonstrate powerful capabilities, deploying them on edge devices has become increasingly crucial, offering advantages in privacy and real-time interaction.
Open3DBench: Open-Source Benchmark for 3D-IC Backend Implementation and PPA Evaluation	Yunqi Shi, Chengrui Gao, Wanqi Ren, Peng Xie, Siyuan Xu, Ke Xue, Mingxuan Yuan, Chao Qian, Zhi-Hua Zhou	2025-03-17	Download	This work introduces Open3DBench, an open-source 3D-IC backend implementation benchmark built upon the OpenROAD-flow-scripts framework, enabling comprehensive evaluation of power, performance, area, a...
SparseLUT: Sparse Connectivity Optimization for Lookup Table-based Deep Neural Networks	Binglei Lou, Ruilin Wu, Philip Leong	2025-03-17	Download	The deployment of deep neural networks (DNNs) on resource-constrained edge devices such as field-programmable gate arrays (FPGAs) requires a careful balance of latency, power, and resource usage while...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Do Large Language Models Understand Performance Optimization?	Bowen Cui, Tejas Ramesh, Oscar Hernandez, Keren Zhou	2025-03-17	Download	Large Language Models (LLMs) have emerged as powerful tools for software development tasks such as code completion, translation, and optimization.
Container late-binding in unprivileged dHTC pilot systems on Kubernetes resources	Igor Sfiligoi, Yunjin Zhu, Jaime Frey	2025-03-17	Download	The scientific and research community has benefited greatly from containerized distributed High Throughput Computing (dHTC), both by enabling elastic scaling of user compute workloads to thousands of ...
FedVSR: Towards Model-Agnostic Federated Learning in Video Super-Resolution	Ali Mollaahmadi Dehaghi, Hossein KhademSohi, Reza Razavi, Steve Drew, Mohammad Moshirpour	2025-03-17	Download	Video super-resolution (VSR) aims to enhance low-resolution videos by leveraging both spatial and temporal information. While deep learning has led to impressive progress, it typically requires centra...
Exploring the Potential of Carbon-Aware Execution for Scientific Workflows	Kathleen West, Fabian Lehmann, Vasilis Bountris, Ulf Leser, Yehia Elkhatib, Lauritz Thamsen	2025-03-17	Download	Scientific workflows are widely used to automate scientific data analysis and often involve processing large quantities of data on compute clusters.
Energy-Efficient and High-Performance Data Transfers with DRL Agents	Hasibul Jamil, Jacob Goldverg, Elvis Rodrigues, MD S Q Zulkar Nine, Tevfik Kosar	2025-03-17	Download	The rapid growth of data across fields of science and industry has increased the need to improve the performance of end-to-end data transfers while using the resources more efficiently.
XChainDataGen: A Cross-Chain Dataset Generation Framework	André Augusto, André Vasconcelos, Miguel Correia, Luyao Zhang	2025-03-17	Download	The number of blockchain interoperability protocols for transferring data and assets between blockchains has grown significantly. However, no open dataset of cross-chain transactions exists to study i...
SDFLMQ: A Semi-Decentralized Federated Learning Framework over MQTT	Amir Ali-Pour, Julien Gascon-Samson	2025-03-17	Download	Federated Learning is widely discussed as a distributed machine learning concept with stress on preserving data privacy. Various structures of Federated Learning were proposed.
Optimal Expert Selection for Distributed Mixture-of-Experts at the Wireless Edge	Shengling Qin, Hai Wu, Hongyang Du, Kaibin Huang	2025-03-17	Download	The emergence of distributed Mixture-of-Experts (DMoE) systems, which deploy expert models at edge nodes, offers a pathway to achieving connected intelligence in sixth-generation (6G) mobile networks ...
Scalable Runtime Architecture for Data-driven, Hybrid HPC and ML Workflow Applications	Andre Merzky, Mikhail Titov, Matteo Turilli, Ozgur Kilic, Tianle Wang, Shantenu Jha	2025-03-17	Download	Hybrid workflows combining traditional HPC and novel ML methodologies are transforming scientific computing. This paper presents the architecture and implementation of a scalable runtime system that e...
Generative AI for Software Architecture. Applications, Challenges, and Future Directions	Matteo Esposito, Xiaozhou Li, Sergio Moreschini, Noman Ahmad, Tomas Cerny, Karthik Vaidhyanathan, Valentina Lenarduzzi, Davide Taibi	2025-03-17	Download	Context: Generative Artificial Intelligence (GenAI) is transforming much of software development, yet its application in software architecture is still in its infancy, and no prior study has systemati...
Zero-Knowledge Proof-Based Consensus for Blockchain-Secured Federated Learning	Tianxing Fu, Jia Hu, Geyong Min, Zi Wang	2025-03-17	Download	Federated learning (FL) enables multiple participants to collaboratively train machine learning models while ensuring their data remains private and secure.
GC-Fed: Gradient Centralized Federated Learning with Partial Client Participation	Jungwon Seo, Ferhat Ozgur Catak, Chunming Rong, Kibeom Hong, Minhoe Kim	2025-03-17	Download	Federated Learning (FL) enables privacy-preserving multi-source information fusion (MSIF) but is challenged by client drift in highly heterogeneous data settings.
ILVES: Accurate and efficient bond length and angle constraints in molecular dynamics	Lorién López-Villellas, Carl Christian Kjelgaard Mikkelsen, Juan José Galano-Frutos, Santiago Marco-Sola, Jesús Alastruey-Benedé, Pablo Ibáñez, Pablo Echenique, Miquel Moretó, Maria Cristina De Rosa, Pablo García-Risueño	2025-03-17	Download	All-atom, force field-based molecular dynamics simulations are essential tools in computational chemistry, enabling the prediction and analysis of biomolecular systems with atomic-level resolution.
WOW: Workflow-Aware Data Movement and Task Scheduling for Dynamic Scientific Workflows	Fabian Lehmann, Jonathan Bader, Friedrich Tschirpke, Ninon De Mecquenem, Ansgar Lößer, Soeren Becker, Katarzyna Ewa Lewińska, Lauritz Thamsen, Ulf Leser	2025-03-17	Download	Scientific workflows process extensive data sets over clusters of independent nodes, which requires a complex stack of infrastructure components, especially a resource manager (RM) for task-to-node as...
Byzantine-Tolerant Consensus in GPU-Inspired Shared Memory	Chryssis Georgiou, Manaswini Piduguralla, Sathya Peri	2025-03-17	Download	In this work, we formalize a novel shared memory model inspired by the popular GPU architecture. Within this model, we develop algorithmic solutions to the Byzantine Consensus problem and analyze thei...
Understanding the Communication Needs of Asynchronous Many-Task Systems -- A Case Study of HPX+LCI	Jiakun Yan, Hartmut Kaiser, Marc Snir	2025-03-17	Download	Asynchronous Many-Task (AMT) systems offer a potential solution for efficiently programming complicated scientific applications on extreme-scale heterogeneous architectures.
WRATH: Workload Resilience Across Task Hierarchies in Task-based Parallel Programming Frameworks	Sicheng Zhou, Zhuozhao Li, Valérie Hayot-Sasson, Haochen Pan, Maxime Gonthier, J. Gregory Pauloski, Ryan Chard, Kyle Chard, Ian Foster	2025-03-17	Download	Failures in Task-based Parallel Programming (TBPP) can severely degrade performance and result in incomplete or incorrect outcomes. Existing failure-handling approaches, including reactive, proactive,...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Compact routing schemes in undirected and directed graphs	Avi Kadria, Liam Roditty	2025-03-17	Download	In this paper, we study the problem of compact routing schemes in weighted undirected and directed graphs. For weighted undirected graphs, more than a decade ago, Chechik [PODC'13] presente...
Towards Energy- and QoS-aware Load Balancing for 6G: Leveraging O-RAN to Achieve Sustainable and Energy-Efficient 6G	Gustavo Z. Bruno, Gabriel M. Almeida, Aloizio Da Silva, Luiz A. DaSilva, Joao F. Santos, Alexandre Huff, Kleber V. Cardoso, Cristiano B. Both	2025-03-17	Download	This paper addresses the critical challenge posed by the increasing energy consumption in mobile networks, particularly with the advent of Sixth Generation (6G) technologies.
Energy-Efficient and High-Performance Data Transfers with DRL Agents	Hasibul Jamil, Jacob Goldverg, Elvis Rodrigues, MD S Q Zulkar Nine, Tevfik Kosar	2025-03-17	Download	The rapid growth of data across fields of science and industry has increased the need to improve the performance of end-to-end data transfers while using the resources more efficiently.
Toward Generative 6G Simulation: An Experimental Multi-Agent LLM and ns-3 Integration	Farhad Rezazadeh, Amir Ashtari Gargari, Sandra Lagen, Houbing Song, Dusit Niyato, Lingjia Liu	2025-03-17	Download	The move toward open Sixth-Generation (6G) networks necessitates a novel approach to full-stack simulation environments for evaluating complex technology developments before prototyping and real-world...
Simulating Raman Scattering Impairments with Depolarization Noise in Quantum-Classical Links	Jake Smith, Roberto Proietti	2025-03-17	Download	We model spontaneous Raman scattering noise in polarization-encoded quantum communication channels co-propagating with classical signals using the depolarization channel.
Enable Time-Sensitive Applications in Kubernetes with Container Network Interface Plugin Agnostic Metadata Proxy	Ferenc Orosi, Ferenc Fejes	2025-03-17	Download	Application deployment in cloud environments is dominated by Kubernetes-orchestrated microservices. It provides a secure environment, networking, storage, isolation, scheduling, and many other abstraction...
A Reference Architecture for Autonomous Networks: An Agent-Based Approach	Joseph Sifakis, Dongming Li, Hairong Huang, Yong Zhang, Wenshuan Dang, River Huang, Yijun Yu	2025-03-17	Download	The vision of autonomous systems is becoming increasingly important in many application areas, where the aim is to replace humans with agents.
SafeSlice: Enabling SLA-Compliant O-RAN Slicing via Safe Deep Reinforcement Learning	Ahmad M. Nagib, Hatem Abou-Zeid, Hossam S. Hassanein	2025-03-17	Download	Deep reinforcement learning (DRL)-based slicing policies have shown significant success in simulated environments but face challenges in physical systems such as open radio access networks (O-RANs) du...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
PrETi: Predicting Execution Time in Early Stage with LLVM and Machine Learning	Risheng Xu, Philipp Sieweck, Hermann von Hasseln, Dirk Nowotka	2025-03-17	Download	We introduce PrETi, a novel framework for predicting software execution time during the early stages of development. PrETi leverages an LLVM-based simulation environment to extract timing-related runt...
Energy-Efficient and High-Performance Data Transfers with DRL Agents	Hasibul Jamil, Jacob Goldverg, Elvis Rodrigues, MD S Q Zulkar Nine, Tevfik Kosar	2025-03-17	Download	The rapid growth of data across fields of science and industry has increased the need to improve the performance of end-to-end data transfers while using the resources more efficiently.
HERMES: High-Performance RISC-V Memory Hierarchy for ML Workloads	Pranav Suryadevara	2025-03-17	Download	The growth of machine learning (ML) workloads has underscored the importance of efficient memory hierarchies to address bandwidth, latency, and scalability challenges.