2025-09-10

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| LLM-VeriPPA: Power, Performance, and Area Optimization aware Verilog Code Generation with Large Language Models | Kiran Thorat, Jiahui Zhao, Yaotian Liu, Amit Hasan, Hongwu Peng, Xi Xie, Bin Lei, Caiwen Ding | 2025-09-10 | Download | Large Language Models (LLMs) are gaining prominence in various fields, thanks to their ability to generate high-quality content from human instructions. |
| Securing Cryptographic Software via Typed Assembly Language (Extended Version) | Shixin Song, Tingzhen Dong, Kosi Nwabueze, Julian Zanders, Andres Erbsen, Adam Chlipala, Mengjia Yan | 2025-09-10 | Download | Authors of cryptographic software are well aware that their code should not leak secrets through its timing behavior, and, until 2018, they believed that following industry-standard constant-time codi... |
| BitROM: Weight Reload-Free CiROM Architecture Towards Billion-Parameter 1.58-bit LLM Inference | Wenlun Zhang, Xinyu Li, Shimpei Ando, Kentaro Yoshioka | 2025-09-10 | Download | Compute-in-Read-Only-Memory (CiROM) accelerators offer outstanding energy efficiency for CNNs by eliminating runtime weight updates. However, their scalability to Large Language Models (LLMs) is funda... |
| Decentor-V: Lightweight ML Training on Low-Power RISC-V Edge Devices | Marcelo Ribeiro, Diogo Costa, Gonçalo Moreira, Sandro Pinto, Tiago Gomes | 2025-09-10 | Download | Modern IoT devices increasingly rely on machine learning solutions to process data locally. However, the lack of graphics processing units (GPUs) or dedicated accelerators on most platforms makes on-d... |
| AutoVeriFix: Automatically Correcting Errors and Enhancing Functional Correctness in LLM-Generated Verilog Code | Yan Tan, Xiangchen Meng, Zijun Jiang, Yangdi Lyu | 2025-09-10 | Download | Large language models (LLMs) have demonstrated impressive capabilities in generating software code for high-level programming languages such as Python and C++. |
| FASE: FPGA-Assisted Syscall Emulation for Rapid End-to-End Processor Performance Validation | Chengzhen Meng, Xiuzhuang Chen, Hongjun Dai | 2025-09-10 | Download | The rapid advancement of AI workloads and domain-specific architectures has led to increasingly diverse processor microarchitectures, whose design exploration requires fast and accurate performance va... |
| Aurora: Architecting Argonne's First Exascale Supercomputer for Accelerated Scientific Discovery | William E. Allcock, Benjamin S. Allen, James Anchell, Victor Anisimov, Thomas Applencourt, Abhishek Bagusetty, Ramesh Balakrishnan, Riccardo Balin, Solomon Bekele, Colleen Bertoni, Cyrus Blackworth, Renzo Bustamante, Kevin Canada, John Carrier, Christopher Chan-nui, Lance C. Cheney, Taylor Childers, Paul Coffman, Susan Coghlan, Tanima Dey, Michael D'Mello, Ashok Emani, Murali Emani, Kyle G. Felker, Sam Foreman, Olivier Franza, Longfei Gao, Marta García, María Garzarán, Balazs Gerofi, Yasaman Ghadar, Subrata Goswami, Neha Gupta, Kevin Harms, Väinö Hatanpää, Brian Holland, Carissa Holohan, Brian Homerding, Khalid Hossain, Xue Hu, Louise Huot, Huda Ibeid, Joseph A. Insley, Sai Jayanthi, Hong Jiang, Wei Jiang, Xiao-Yong Jin, Jeongnim Kim, Christopher Knight, Panagiotis Kourdis, Kalyan Kumaran, JaeHyuk Kwack, Janghaeng Lee, Ti Leggett, Ben Lenard, Chris Lewis, Nevin Liber, Johann Lombardi, Raymond M. Loy, Ye Luo, Bethany Lusch, Nilakantan Mahadevan, Beth Markey, Victor A. Mateevitsi, Gordon McPheeters, Ryan Milner, Jerome Mitchell, Vitali A. Morozov, Servesh Muralidharan, Tom Musta, Mrigendra Nagar, Vikram Narayana, Marieme Ngom, Anthony-Trung Nguyen, Nathan Nichols, Aditya Nishtala, James C. Osborn, Michael E. Papka, Scott Parker, Saumil S. Patel, Julia Piotrowska, Adrian C. Pope, Sucheta Raghunanda, Esteban Rangel, Paul M. Rich, Katherine M. Riley, Silvio Rizzi, Kris Rowe, Varuni Sastry, Adam Scovel, Filippo Simini, Haritha Siddabathuni Som, Patrick Steinbrecher, Rick Stevens, Xinmin Tian, Peter Upton, Thomas Uram, Archit K. Vasan, Álvaro Vázquez-Mayagoitia, Kaushik Velusamy, Brice Videau, Venkatram Vishwanath, Brian Whitney, Timothy J. Williams, Michael Woodacre, Sam Zeltner, Chuanjun Zhang, Gengbin Zheng, Huihuo Zheng | 2025-09-10 | Download | Aurora is Argonne National Laboratory's pioneering Exascale supercomputer, designed to accelerate scientific discovery with cutting-edge architectural innovations. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Optimizing the Variant Calling Pipeline Execution on Human Genomes Using GPU-Enabled Machines | Ajay Kumar, Praveen Rao, Peter Sanders | 2025-09-10 | Download | Variant calling is the first step in analyzing a human genome and aims to detect variants in an individual's genome compared to a reference genome. |
| HARD: A Performance Portable Radiation Hydrodynamics Code based on FleCSI Framework | Julien Loiseau, Hyun Lim, Andrés Yagüe López, Mammadbaghir Baghirzade, Shihab Shahriar Khan, Yoonsoo Kim, Sudarshan Neopane, Alexander Strack, Farhana Taiyebah, Benjamin K. Bergen | 2025-09-10 | Download | Hydrodynamics And Radiation Diffusion (HARD) is an open-source application for high-performance simulations of compressible hydrodynamics with radiation-diffusion coupling. |
| A Comparative Analysis of Identifier Schemes: UUIDv4, UUIDv7, and ULID for Distributed Systems | Nima Karimian Kakolaki | 2025-09-10 | Download | Distributed systems require robust, scalable identifier schemes to ensure data uniqueness and efficient indexing across multiple nodes. This paper presents a comprehensive analysis of the evolution of... |
| Reconfigurable Holographic Surfaces and Near Field Communication for Non-Terrestrial Networks: Potential and Challenges | Muhammad Ali Jamshed, Muhammad Ahmed Mohsin, Hongliang Zhang, Bushra Haq, Aryan Kaushik, Boya Di, Weiwei Jiang | 2025-09-10 | Download | To overcome the challenges of ultra-low latency, ubiquitous coverage, and soaring data rates, this article presents a combined use of Near Field Communication (NFC) and Reconfigurable Holographic Surf... |
| A 410GFLOP/s, 64 RISC-V Cores, 204.8GBps Shared-Memory Cluster in 12nm FinFET with Systolic Execution Support for Efficient B5G/6G AI-Enhanced O-RAN | Yichao Zhang, Marco Bertuletti, Sergio Mazzola, Samuel Riedel, Luca Benini | 2025-09-10 | Download | We present HeartStream, a 64-RV-core shared-L1-memory cluster (410 GFLOP/s peak performance and 204.8 GBps L1 bandwidth) for energy-efficient AI-enhanced O-RAN. |
| A Coopetitive-Compatible Data Generation Framework for Cross-silo Federated Learning | Thanh Linh Nguyen, Quoc-Viet Pham | 2025-09-10 | Download | Cross-silo federated learning (CFL) enables organizations (e.g., hospitals or banks) to collaboratively train artificial intelligence (AI) models while preserving data privacy by keeping data local. |
| DSFL: A Dual-Server Byzantine-Resilient Federated Learning Framework via Group-Based Secure Aggregation | Charuka Herath, Yogachandran Rahulamathavan, Varuna De Silva, Sangarapillai Lambotharan | 2025-09-10 | Download | Federated Learning (FL) enables decentralized model training without sharing raw data, offering strong privacy guarantees. However, existing FL protocols struggle to defend against Byzantine participa... |
| Towards Communication-Efficient Decentralized Federated Graph Learning over Non-IID Data | Shilong Wang, Jianchun Liu, Hongli Xu, Chenxia Tang, Qianpiao Ma, Liusheng Huang | 2025-09-10 | Download | Decentralized Federated Graph Learning (DFGL) overcomes potential bottlenecks of the parameter server in FGL by establishing a peer-to-peer (P2P) communication network among workers. |
| An HPC Benchmark Survey and Taxonomy for Characterization | Andreas Herten, Olga Pearce, Filipe S. M. Guimarães | 2025-09-10 | Download | The field of High-Performance Computing (HPC) is defined by providing computing devices with highest performance for a variety of demanding scientific users. |
| Hetis: Serving LLMs in Heterogeneous GPU Clusters with Fine-grained and Dynamic Parallelism | Zizhao Mo, Jianxiong Liao, Huanle Xu, Zhi Zhou, Chengzhong Xu | 2025-09-10 | Download | The significant resource demands in LLM serving prompts production clusters to fully utilize heterogeneous hardware by partitioning LLM models across a mix of high-end and low-end GPUs. |
| Design and Implementation of Code Completion System Based on LLM and CodeBERT Hybrid Subsystem | Bingbing Zhang, Ziyu Lin, Yingxin Su | 2025-09-10 | Download | In the rapidly evolving industry of software development, coding efficiency and accuracy play significant roles in delivering high-quality software. |
| Aurora: Architecting Argonne's First Exascale Supercomputer for Accelerated Scientific Discovery | William E. Allcock, Benjamin S. Allen, James Anchell, Victor Anisimov, Thomas Applencourt, Abhishek Bagusetty, Ramesh Balakrishnan, Riccardo Balin, Solomon Bekele, Colleen Bertoni, Cyrus Blackworth, Renzo Bustamante, Kevin Canada, John Carrier, Christopher Chan-nui, Lance C. Cheney, Taylor Childers, Paul Coffman, Susan Coghlan, Tanima Dey, Michael D'Mello, Ashok Emani, Murali Emani, Kyle G. Felker, Sam Foreman, Olivier Franza, Longfei Gao, Marta García, María Garzarán, Balazs Gerofi, Yasaman Ghadar, Subrata Goswami, Neha Gupta, Kevin Harms, Väinö Hatanpää, Brian Holland, Carissa Holohan, Brian Homerding, Khalid Hossain, Xue Hu, Louise Huot, Huda Ibeid, Joseph A. Insley, Sai Jayanthi, Hong Jiang, Wei Jiang, Xiao-Yong Jin, Jeongnim Kim, Christopher Knight, Panagiotis Kourdis, Kalyan Kumaran, JaeHyuk Kwack, Janghaeng Lee, Ti Leggett, Ben Lenard, Chris Lewis, Nevin Liber, Johann Lombardi, Raymond M. Loy, Ye Luo, Bethany Lusch, Nilakantan Mahadevan, Beth Markey, Victor A. Mateevitsi, Gordon McPheeters, Ryan Milner, Jerome Mitchell, Vitali A. Morozov, Servesh Muralidharan, Tom Musta, Mrigendra Nagar, Vikram Narayana, Marieme Ngom, Anthony-Trung Nguyen, Nathan Nichols, Aditya Nishtala, James C. Osborn, Michael E. Papka, Scott Parker, Saumil S. Patel, Julia Piotrowska, Adrian C. Pope, Sucheta Raghunanda, Esteban Rangel, Paul M. Rich, Katherine M. Riley, Silvio Rizzi, Kris Rowe, Varuni Sastry, Adam Scovel, Filippo Simini, Haritha Siddabathuni Som, Patrick Steinbrecher, Rick Stevens, Xinmin Tian, Peter Upton, Thomas Uram, Archit K. Vasan, Álvaro Vázquez-Mayagoitia, Kaushik Velusamy, Brice Videau, Venkatram Vishwanath, Brian Whitney, Timothy J. Williams, Michael Woodacre, Sam Zeltner, Chuanjun Zhang, Gengbin Zheng, Huihuo Zheng | 2025-09-10 | Download | Aurora is Argonne National Laboratory's pioneering Exascale supercomputer, designed to accelerate scientific discovery with cutting-edge architectural innovations. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Robust Belief-State Policy Learning for Quantum Network Routing Under Decoherence and Time-Varying Conditions | Amirhossein Taherpour, Abbas Taherpour, Tamer Khattab | 2025-09-10 | Download | This paper presents a feature-based Partially Observable Markov Decision Process (POMDP) framework for quantum network routing, combining belief-state planning with Graph Neural Networks (GNNs) to add... |
| The Role of Legacy Mobile Networks in Infrastructure Resilience: Evidence from the Southern Brazil Flood | Daniel Meyer, Lisandro Z Granville, Leandro M. Bertholdo | 2025-09-10 | Download | This paper investigates the resilience of mobile communication networks during the extreme flooding that affected Rio Grande do Sul, Brazil, in May 2024. |
| Design and Development of a Scalable and Energy-Efficient Localization Framework Leveraging LoRa Ranging-Capable Transceivers | Hasan Albinsaid, Bodhibrata Mukhopadhyay, Mohamed-Slim Alouini | 2025-09-10 | Download | Precise and energy-efficient localization is a critical requirement in many Internet of Things (IoT) applications, particularly in large-scale deployments such as asset tagging, agriculture, and smart... |
| SKYLINK: Scalable and Resilient Link Management in LEO Satellite Network | Wanja de Sombre, Arash Asadi, Debopam Bhattacherjee, Deepak Vasisht, Andrea Ortiz | 2025-09-10 | Download | The rapid growth of space-based services has established LEO satellite networks as a promising option for global broadband connectivity. Next-generation LEO networks leverage inter-satellite links (IS... |
| Ubiquitous Intelligence Via Wireless Network-Driven LLMs Evolution | Xingkun Yin, Feiran You, Hongyang Du, Kaibin Huang | 2025-09-10 | Download | We introduce ubiquitous intelligence as a paradigm where Large Language Models (LLMs) evolve within wireless network-driven ecosystems. Unlike static model deployments, this approach enables scalable ... |
| From Physical to Logical: Graph-State-Based Connectivity in Quantum Networks | Mateo M. Blanco, Manuel Fernández-Veiga, Ana Fernández-Vilas, Rebeca P. Díaz-Redondo | 2025-09-10 | Download | Entanglement is a key resource in quantum communication, but bipartite schemes are often insufficient for advanced protocols like quantum secret sharing or distributed computing. |
| Enhancing 6G Network Security and Incident Response through Integrated VNF and SDN Technologies | Abdul Razaque, Abitkhanova Zhadyra Abitkhanovna | 2025-09-10 | Download | Low-speed internet can negatively affect incident response in a number of ways, including decreased teamwork, delayed detection, inefficient action, and elevated risk. |
| EFPIX: A zero-trust encrypted flood protocol | Arin Upadhyay | 2025-09-10 | Download | We propose EFPIX (Encrypted Flood Protocol for Information eXchange), a flood-based relay communication protocol that achieves end-to-end encryption, plausible deniability for users, and untraceable m... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Time-Fair Benchmarking for Metaheuristics: A Restart-Fair Protocol for Fixed-Time Comparisons | Junbo Jacob Lian | 2025-09-10 | Download | Numerous purportedly improved metaheuristics claim superior performance based on equivalent function evaluations (FEs), yet often conceal additional computational burdens in more intensive iterations,... |
| Memshare: Memory Sharing for Multicore Computation in R with an Application to Feature Selection by Mutual Information using PDE | Michael C. Thrun, Julian Märte | 2025-09-10 | Download | We present memshare (the software package is published as a CRAN package under https://CRAN.R-project.org/package=memshare), a package that enables shared memory multicore computation in R by a... |
| Noise Injection for Performance Bottleneck Analysis | Aurélien Delval, Pablo de Oliveira Castro, William Jalby, Etienne Renault | 2025-09-10 | Download | Bottleneck evaluation plays a crucial part in performance tuning of HPC applications, as it directly influences the search for optimizations and the selection of the best hardware for a given code. |
| Aurora: Architecting Argonne's First Exascale Supercomputer for Accelerated Scientific Discovery | William E. Allcock, Benjamin S. Allen, James Anchell, Victor Anisimov, Thomas Applencourt, Abhishek Bagusetty, Ramesh Balakrishnan, Riccardo Balin, Solomon Bekele, Colleen Bertoni, Cyrus Blackworth, Renzo Bustamante, Kevin Canada, John Carrier, Christopher Chan-nui, Lance C. Cheney, Taylor Childers, Paul Coffman, Susan Coghlan, Tanima Dey, Michael D'Mello, Ashok Emani, Murali Emani, Kyle G. Felker, Sam Foreman, Olivier Franza, Longfei Gao, Marta García, María Garzarán, Balazs Gerofi, Yasaman Ghadar, Subrata Goswami, Neha Gupta, Kevin Harms, Väinö Hatanpää, Brian Holland, Carissa Holohan, Brian Homerding, Khalid Hossain, Xue Hu, Louise Huot, Huda Ibeid, Joseph A. Insley, Sai Jayanthi, Hong Jiang, Wei Jiang, Xiao-Yong Jin, Jeongnim Kim, Christopher Knight, Panagiotis Kourdis, Kalyan Kumaran, JaeHyuk Kwack, Janghaeng Lee, Ti Leggett, Ben Lenard, Chris Lewis, Nevin Liber, Johann Lombardi, Raymond M. Loy, Ye Luo, Bethany Lusch, Nilakantan Mahadevan, Beth Markey, Victor A. Mateevitsi, Gordon McPheeters, Ryan Milner, Jerome Mitchell, Vitali A. Morozov, Servesh Muralidharan, Tom Musta, Mrigendra Nagar, Vikram Narayana, Marieme Ngom, Anthony-Trung Nguyen, Nathan Nichols, Aditya Nishtala, James C. Osborn, Michael E. Papka, Scott Parker, Saumil S. Patel, Julia Piotrowska, Adrian C. Pope, Sucheta Raghunanda, Esteban Rangel, Paul M. Rich, Katherine M. Riley, Silvio Rizzi, Kris Rowe, Varuni Sastry, Adam Scovel, Filippo Simini, Haritha Siddabathuni Som, Patrick Steinbrecher, Rick Stevens, Xinmin Tian, Peter Upton, Thomas Uram, Archit K. Vasan, Álvaro Vázquez-Mayagoitia, Kaushik Velusamy, Brice Videau, Venkatram Vishwanath, Brian Whitney, Timothy J. Williams, Michael Woodacre, Sam Zeltner, Chuanjun Zhang, Gengbin Zheng, Huihuo Zheng | 2025-09-10 | Download | Aurora is Argonne National Laboratory's pioneering Exascale supercomputer, designed to accelerate scientific discovery with cutting-edge architectural innovations. |
