
2025-03-17

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| LIMCA: LLM for Automating Analog In-Memory Computing Architecture Design Exploration | Deepak Vungarala, Md Hasibul Amin, Pietro Mercati, Arnob Ghosh, Arman Roohi, Ramtin Zand, Shaahin Angizi | 2025-03-17 | Download | Resistive crossbars enabling analog In-Memory Computing (IMC) have emerged as a promising architecture for Deep Neural Network (DNN) acceleration, offering high memory bandwidth and in-situ computatio... |
| VeriLeaky: Navigating IP Protection vs Utility in Fine-Tuning for LLM-Driven Verilog Coding | Zeng Wang, Minghao Shao, Mohammed Nabeel, Prithwish Basu Roy, Likhitha Mankali, Jitendra Bhandari, Ramesh Karri, Ozgur Sinanoglu, Muhammad Shafique, Johann Knechtel | 2025-03-17 | Download | Large language models (LLMs) offer significant potential for coding, yet fine-tuning (FT) with curated data is essential for niche languages like Verilog. |
| VeriContaminated: Assessing LLM-Driven Verilog Coding for Data Contamination | Zeng Wang, Minghao Shao, Jitendra Bhandari, Likhitha Mankali, Ramesh Karri, Ozgur Sinanoglu, Muhammad Shafique, Johann Knechtel | 2025-03-17 | Download | Large Language Models (LLMs) have revolutionized code generation, achieving exceptional results on various established benchmarking frameworks. |
| Managing Hybrid Solid-State Drives Using Large Language Models | Qian Wei, Yi Li, Zehao Chen, Zhaoyan Shen, Dongxiao Yu, Bingzhe Li | 2025-03-17 | Download | Hybrid Solid-State Drives (SSDs), which integrate several types of flash cells (e.g., single-level cell (SLC) and multiple-level cell (MLC)) in a single drive and enable them to convert between each o... |
| HERMES: High-Performance RISC-V Memory Hierarchy for ML Workloads | Pranav Suryadevara | 2025-03-17 | Download | The growth of machine learning (ML) workloads has underscored the importance of efficient memory hierarchies to address bandwidth, latency, and scalability challenges. |
| ROMA: a Read-Only-Memory-based Accelerator for QLoRA-based On-Device LLM | Wenqiang Wang, Yijia Zhang, Zikai Zhang, Guanting Huo, Hao Liang, Shijie Cao, Ningyi Xu | 2025-03-17 | Download | As large language models (LLMs) demonstrate powerful capabilities, deploying them on edge devices has become increasingly crucial, offering advantages in privacy and real-time interaction. |
| Open3DBench: Open-Source Benchmark for 3D-IC Backend Implementation and PPA Evaluation | Yunqi Shi, Chengrui Gao, Wanqi Ren, Peng Xie, Siyuan Xu, Ke Xue, Mingxuan Yuan, Chao Qian, Zhi-Hua Zhou | 2025-03-17 | Download | This work introduces Open3DBench, an open-source 3D-IC backend implementation benchmark built upon the OpenROAD-flow-scripts framework, enabling comprehensive evaluation of power, performance, area, a... |
| SparseLUT: Sparse Connectivity Optimization for Lookup Table-based Deep Neural Networks | Binglei Lou, Ruilin Wu, Philip Leong | 2025-03-17 | Download | The deployment of deep neural networks (DNNs) on resource-constrained edge devices such as field-programmable gate arrays (FPGAs) requires a careful balance of latency, power, and resource usage while... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Do Large Language Models Understand Performance Optimization? | Bowen Cui, Tejas Ramesh, Oscar Hernandez, Keren Zhou | 2025-03-17 | Download | Large Language Models (LLMs) have emerged as powerful tools for software development tasks such as code completion, translation, and optimization. |
| Container late-binding in unprivileged dHTC pilot systems on Kubernetes resources | Igor Sfiligoi, Yunjin Zhu, Jaime Frey | 2025-03-17 | Download | The scientific and research community has benefited greatly from containerized distributed High Throughput Computing (dHTC), both by enabling elastic scaling of user compute workloads to thousands of ... |
| FedVSR: Towards Model-Agnostic Federated Learning in Video Super-Resolution | Ali Mollaahmadi Dehaghi, Hossein KhademSohi, Reza Razavi, Steve Drew, Mohammad Moshirpour | 2025-03-17 | Download | Video super-resolution (VSR) aims to enhance low-resolution videos by leveraging both spatial and temporal information. While deep learning has led to impressive progress, it typically requires centra... |
| Exploring the Potential of Carbon-Aware Execution for Scientific Workflows | Kathleen West, Fabian Lehmann, Vasilis Bountris, Ulf Leser, Yehia Elkhatib, Lauritz Thamsen | 2025-03-17 | Download | Scientific workflows are widely used to automate scientific data analysis and often involve processing large quantities of data on compute clusters. |
| Energy-Efficient and High-Performance Data Transfers with DRL Agents | Hasibul Jamil, Jacob Goldverg, Elvis Rodrigues, MD S Q Zulkar Nine, Tevfik Kosar | 2025-03-17 | Download | The rapid growth of data across fields of science and industry has increased the need to improve the performance of end-to-end data transfers while using the resources more efficiently. |
| XChainDataGen: A Cross-Chain Dataset Generation Framework | André Augusto, André Vasconcelos, Miguel Correia, Luyao Zhang | 2025-03-17 | Download | The number of blockchain interoperability protocols for transferring data and assets between blockchains has grown significantly. However, no open dataset of cross-chain transactions exists to study i... |
| SDFLMQ: A Semi-Decentralized Federated Learning Framework over MQTT | Amir Ali-Pour, Julien Gascon-Samson | 2025-03-17 | Download | Federated Learning is widely discussed as a distributed machine learning concept with stress on preserving data privacy. Various structures of Federated Learning were proposed. |
| Optimal Expert Selection for Distributed Mixture-of-Experts at the Wireless Edge | Shengling Qin, Hai Wu, Hongyang Du, Kaibin Huang | 2025-03-17 | Download | The emergence of distributed Mixture-of-Experts (DMoE) systems, which deploy expert models at edge nodes, offers a pathway to achieving connected intelligence in sixth-generation (6G) mobile networks ... |
| Scalable Runtime Architecture for Data-driven, Hybrid HPC and ML Workflow Applications | Andre Merzky, Mikhail Titov, Matteo Turilli, Ozgur Kilic, Tianle Wang, Shantenu Jha | 2025-03-17 | Download | Hybrid workflows combining traditional HPC and novel ML methodologies are transforming scientific computing. This paper presents the architecture and implementation of a scalable runtime system that e... |
| Generative AI for Software Architecture. Applications, Challenges, and Future Directions | Matteo Esposito, Xiaozhou Li, Sergio Moreschini, Noman Ahmad, Tomas Cerny, Karthik Vaidhyanathan, Valentina Lenarduzzi, Davide Taibi | 2025-03-17 | Download | Context: Generative Artificial Intelligence (GenAI) is transforming much of software development, yet its application in software architecture is still in its infancy, and no prior study has systemati... |
| Zero-Knowledge Proof-Based Consensus for Blockchain-Secured Federated Learning | Tianxing Fu, Jia Hu, Geyong Min, Zi Wang | 2025-03-17 | Download | Federated learning (FL) enables multiple participants to collaboratively train machine learning models while ensuring their data remains private and secure. |
| GC-Fed: Gradient Centralized Federated Learning with Partial Client Participation | Jungwon Seo, Ferhat Ozgur Catak, Chunming Rong, Kibeom Hong, Minhoe Kim | 2025-03-17 | Download | Federated Learning (FL) enables privacy-preserving multi-source information fusion (MSIF) but is challenged by client drift in highly heterogeneous data settings. |
| ILVES: Accurate and efficient bond length and angle constraints in molecular dynamics | Lorién López-Villellas, Carl Christian Kjelgaard Mikkelsen, Juan José Galano-Frutos, Santiago Marco-Sola, Jesús Alastruey-Benedé, Pablo Ibáñez, Pablo Echenique, Miquel Moretó, Maria Cristina De Rosa, Pablo García-Risueño | 2025-03-17 | Download | All-atom, force field-based molecular dynamics simulations are essential tools in computational chemistry, enabling the prediction and analysis of biomolecular systems with atomic-level resolution. |
| WOW: Workflow-Aware Data Movement and Task Scheduling for Dynamic Scientific Workflows | Fabian Lehmann, Jonathan Bader, Friedrich Tschirpke, Ninon De Mecquenem, Ansgar Lößer, Soeren Becker, Katarzyna Ewa Lewińska, Lauritz Thamsen, Ulf Leser | 2025-03-17 | Download | Scientific workflows process extensive data sets over clusters of independent nodes, which requires a complex stack of infrastructure components, especially a resource manager (RM) for task-to-node as... |
| Byzantine-Tolerant Consensus in GPU-Inspired Shared Memory | Chryssis Georgiou, Manaswini Piduguralla, Sathya Peri | 2025-03-17 | Download | In this work, we formalize a novel shared memory model inspired by the popular GPU architecture. Within this model, we develop algorithmic solutions to the Byzantine Consensus problem and analyze thei... |
| Understanding the Communication Needs of Asynchronous Many-Task Systems -- A Case Study of HPX+LCI | Jiakun Yan, Hartmut Kaiser, Marc Snir | 2025-03-17 | Download | Asynchronous Many-Task (AMT) systems offer a potential solution for efficiently programming complicated scientific applications on extreme-scale heterogeneous architectures. |
| WRATH: Workload Resilience Across Task Hierarchies in Task-based Parallel Programming Frameworks | Sicheng Zhou, Zhuozhao Li, Valérie Hayot-Sasson, Haochen Pan, Maxime Gonthier, J. Gregory Pauloski, Ryan Chard, Kyle Chard, Ian Foster | 2025-03-17 | Download | Failures in Task-based Parallel Programming (TBPP) can severely degrade performance and result in incomplete or incorrect outcomes. Existing failure-handling approaches, including reactive, proactive,... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Compact routing schemes in undirected and directed graphs | Avi Kadria, Liam Roditty | 2025-03-17 | Download | In this paper, we study the problem of compact routing schemes in weighted undirected and directed graphs. For weighted undirected graphs, more than a decade ago, Chechik [PODC'13] presente... |
| Towards Energy- and QoS-aware Load Balancing for 6G: Leveraging O-RAN to Achieve Sustainable and Energy-Efficient 6G | Gustavo Z. Bruno, Gabriel M. Almeida, Aloizio Da Silva, Luiz A. DaSilva, Joao F. Santos, Alexandre Huff, Kleber V. Cardoso, Cristiano B. Both | 2025-03-17 | Download | This paper addresses the critical challenge posed by the increasing energy consumption in mobile networks, particularly with the advent of Sixth Generation (6G) technologies. |
| Energy-Efficient and High-Performance Data Transfers with DRL Agents | Hasibul Jamil, Jacob Goldverg, Elvis Rodrigues, MD S Q Zulkar Nine, Tevfik Kosar | 2025-03-17 | Download | The rapid growth of data across fields of science and industry has increased the need to improve the performance of end-to-end data transfers while using the resources more efficiently. |
| Toward Generative 6G Simulation: An Experimental Multi-Agent LLM and ns-3 Integration | Farhad Rezazadeh, Amir Ashtari Gargari, Sandra Lagen, Houbing Song, Dusit Niyato, Lingjia Liu | 2025-03-17 | Download | The move toward open Sixth-Generation (6G) networks necessitates a novel approach to full-stack simulation environments for evaluating complex technology developments before prototyping and real-world... |
| Simulating Raman Scattering Impairments with Depolarization Noise in Quantum-Classical Links | Jake Smith, Roberto Proietti | 2025-03-17 | Download | We model spontaneous Raman scattering noise in polarization-encoded quantum communication channels co-propagating with classical signals using the depolarization channel. |
| Enable Time-Sensitive Applications in Kubernetes with Container Network Interface Plugin Agnostic Metadata Proxy | Ferenc Orosi, Ferenc Fejes | 2025-03-17 | Download | Application deployment in cloud environments is dominated by Kubernetes-orchestrated microservices. Kubernetes provides a secure environment, networking, storage, isolation, scheduling, and many other abstraction... |
| A Reference Architecture for Autonomous Networks: An Agent-Based Approach | Joseph Sifakis, Dongming Li, Hairong Huang, Yong Zhang, Wenshuan Dang, River Huang, Yijun Yu | 2025-03-17 | Download | The vision of autonomous systems is becoming increasingly important in many application areas, where the aim is to replace humans with agents. |
| SafeSlice: Enabling SLA-Compliant O-RAN Slicing via Safe Deep Reinforcement Learning | Ahmad M. Nagib, Hatem Abou-Zeid, Hossam S. Hassanein | 2025-03-17 | Download | Deep reinforcement learning (DRL)-based slicing policies have shown significant success in simulated environments but face challenges in physical systems such as open radio access networks (O-RANs) du... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| PrETi: Predicting Execution Time in Early Stage with LLVM and Machine Learning | Risheng Xu, Philipp Sieweck, Hermann von Hasseln, Dirk Nowotka | 2025-03-17 | Download | We introduce PrETi, a novel framework for predicting software execution time during the early stages of development. PrETi leverages an LLVM-based simulation environment to extract timing-related runt... |
| Energy-Efficient and High-Performance Data Transfers with DRL Agents | Hasibul Jamil, Jacob Goldverg, Elvis Rodrigues, MD S Q Zulkar Nine, Tevfik Kosar | 2025-03-17 | Download | The rapid growth of data across fields of science and industry has increased the need to improve the performance of end-to-end data transfers while using the resources more efficiently. |
| HERMES: High-Performance RISC-V Memory Hierarchy for ML Workloads | Pranav Suryadevara | 2025-03-17 | Download | The growth of machine learning (ML) workloads has underscored the importance of efficient memory hierarchies to address bandwidth, latency, and scalability challenges. |
