
2024-04-16

cs.AR - Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Tao: Re-Thinking DL-based Microarchitecture Simulation | Santosh Pandey, Amir Yazdanbakhsh, Hang Liu | 2024-04-16 | Download | Microarchitecture simulators are indispensable tools for microarchitecture designers to validate, estimate, and optimize new hardware that meets specific design requirements. |
| From a Lossless (~1.5:1) Compression Algorithm for Llama2 7B Weights to Variable Precision, Variable Range, Compressed Numeric Data Types for CNNs and LLMs | Vincenzo Liguori | 2024-04-16 | Download | This paper starts with a simple lossless ~1.5:1 compression algorithm for the weights of the Large Language Model (LLM) Llama2 7B [1] that can be implemented in ~200 LUTs in AMD FPGAs, processing over... |
| SA-DS: A Dataset for Large Language Model-Driven AI Accelerator Design Generation | Deepak Vungarala, Mahmoud Nazzal, Mehrdad Morsali, Chao Zhang, Arnob Ghosh, Abdallah Khreishah, Shaahin Angizi | 2024-04-16 | Download | In the ever-evolving landscape of Deep Neural Networks (DNN) hardware acceleration, unlocking the true potential of systolic array accelerators has long been hindered by the daunting challenges of exp... |
| Decade-Bandwidth RF-Input Pseudo-Doherty Load Modulated Balanced Amplifier using Signal-Flow-Based Phase Alignment Design | Pingzhu Gong, Jiachen Guo, Niteesh Bharadwaj Vangipurapu, Kenle Chen | 2024-04-16 | Download | This paper reports a first-ever decade-bandwidth pseudo-Doherty load-modulated balanced amplifier (PD-LMBA), designed for emerging 4G/5G communications and multi-band operations. |
| AERO: Adaptive Erase Operation for Improving Lifetime and Performance of Modern NAND Flash-Based SSDs | Sungjun Cho, Beomjun Kim, Hyunuk Cho, Gyeongseob Seo, Onur Mutlu, Myungsuk Kim, Jisung Park | 2024-04-16 | Download | This work investigates a new erase scheme in NAND flash memory to improve the lifetime and performance of modern solid-state drives (SSDs). In NAND flash memory, an erase operation applies a high volt... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Personalized Federated Learning via Stacking | Emilio Cantu-Cervini | 2024-04-16 | Download | Traditional Federated Learning (FL) methods typically train a single global model collaboratively without exchanging raw data. In contrast, Personalized Federated Learning (PFL) techniques aim to crea... |
| GPU-Based Parallel Computing Methods for Medical Photoacoustic Image Reconstruction | Xinyao Yi, Yuxin Qiao | 2024-04-16 | Download | Recent years have witnessed a rapid advancement in GPU technology, establishing it as a formidable high-performance parallel computing technology with superior floating-point computational capabilitie... |
| Trackable Agent-based Evolution Models at Wafer Scale | Matthew Andres Moreno, Connor Yang, Emily Dolson, Luis Zaman | 2024-04-16 | Download | Continuing improvements in computing hardware are poised to transform capabilities for in silico modeling of cross-scale phenomena underlying major open questions in evolutionary biology and artificia... |
| Sustainability of Data Center Digital Twins with Reinforcement Learning | Soumyendu Sarkar, Avisek Naug, Antonio Guillen, Ricardo Luna, Vineet Gundecha, Ashwin Ramesh Babu, Sajad Mousavi | 2024-04-16 | Download | The rapid growth of machine learning (ML) has led to an increased demand for computational power, resulting in larger data centers (DCs) and higher energy consumption. |
| Computing Inductive Invariants of Regular Abstraction Frameworks | Philipp Czerner, Javier Esparza, Valentin Krasotin, Christoph Welzel-Mohr | 2024-04-16 | Download | Regular transition systems (RTS) are a popular formalism for modeling infinite-state systems in general, and parameterised systems in particular. In a CONCUR 22 paper, Esparza et al. |
| Classical and Quantum Distributed Algorithms for the Survivable Network Design Problem | Phillip Kerger, David E. Bernal Neira, Zoe Gonzalez Izquierdo, Eleanor G. Rieffel | 2024-04-16 | Download | We investigate distributed classical and quantum approaches for the survivable network design problem (SNDP), sometimes called the generalized Steiner problem. |
| A Cloud Resources Portfolio Optimization Business Model -- From Theory to Practice | Valentin Haag, Maximilian Kiessler, Benedikt Pittl, Erich Schikuta | 2024-04-16 | Download | Cloud resources have become increasingly important, with many businesses using cloud solutions to supplement or outright replace their existing IT infrastructure. |
| Benchmarking Machine Learning Applications on Heterogeneous Architecture using Reframe | Christopher Rae, Joseph K. L. Lee, James Richings, Michele Weiland | 2024-04-16 | Download | With the rapid increase in machine learning workloads performed on HPC systems, it is beneficial to regularly perform machine learning specific benchmarks to monitor performance and identify issues. |
| LAECIPS: Large Vision Model Assisted Adaptive Edge-Cloud Collaboration for IoT-based Embodied Intelligence System | Shijing Hu, Zhihui Lu, Xin Xu, Ruijun Deng, Xin Du, Qiang Duan | 2024-04-16 | Download | Embodied intelligence (EI) enables manufacturing systems to flexibly perceive, reason, adapt, and operate within dynamic shop floor environments. |
| BoLD: Fast and Cheap Dispute Resolution | Mario M. Alvarez, Henry Arneson, Ben Berger, Lee Bousfield, Chris Buckland, Yafah Edelman, Edward W. Felten, Daniel Goldman, Raul Jordan, Mahimna Kelkar, Akaki Mamageishvili, Harry Ng, Aman Sanghi, Victor Shoup, Terence Tsao | 2024-04-16 | Download | BoLD is a new dispute resolution protocol that is designed to replace the originally deployed Arbitrum dispute resolution protocol. Unlike that protocol, BoLD is resistant to delay attacks. |
| Paving the Way to Hybrid Quantum-Classical Scientific Workflows | Sandeep Suresh Cranganore, Vincenzo De Maio, Ivona Brandic, Ewa Deelman | 2024-04-16 | Download | The increasing growth of data volume, and the consequent explosion in demand for computational power, are affecting scientific computing, as shown by the rise of extreme data scientific workflows. |
| I/O in Machine Learning Applications on HPC Systems: A 360-degree Survey | Noah Lewis, Jean Luca Bez, Surendra Byna | 2024-04-16 | Download | Growing interest in Artificial Intelligence (AI) has resulted in a surge in demand for faster methods of Machine Learning (ML) model training and inference. |
| Accelerating Particle-in-Cell Monte Carlo Simulations with MPI, OpenMP/OpenACC and Asynchronous Multi-GPU Programming | Jeremy J. Williams, Felix Liu, Jordy Trilaksono, David Tskhakaya, Stefan Costea, Leon Kos, Ales Podolnik, Jakub Hromadka, Pratibha Hegde, Marta Garcia-Gasulla, Valentin Seitz, Frank Jenko, Erwin Laure, Stefano Markidis | 2024-04-16 | Download | As fusion energy devices advance, plasma simulations are crucial for reactor design. Our work extends BIT1 hybrid parallelization by integrating MPI with OpenMP and OpenACC, focusing on asynchronous m... |
| Privacy-Enhanced Training-as-a-Service for On-Device Intelligence: Concept, Architectural Scheme, and Open Problems | Zhiyuan Wu, Sheng Sun, Yuwei Wang, Min Liu, Bo Gao, Tianliu He, Wen Wang | 2024-04-16 | Download | On-device intelligence (ODI) enables artificial intelligence (AI) applications to run on end devices, providing real-time and customized AI inference without relying on remote servers. |
| Kilometer-Level Coupled Modeling Using 40 Million Cores: An Eight-Year Journey of Model Development | Xiaohui Duan, Yuxuan Li, Zhao Liu, Bin Yang, Juepeng Zheng, Haohuan Fu, Shaoqing Zhang, Shiming Xu, Yang Gao, Wei Xue, Di Wei, Xiaojing Lv, Lifeng Yan, Haopeng Huang, Haitian Lu, Lingfeng Wan, Haoran Lin, Qixin Chang, Chenlin Li, Quanjie He, Zeyu Song, Xuantong Wang, Yangyang Yu, Xilong Fan, Zhaopeng Qu, Yankun Xu, Xiuwen Guo, Yunlong Fei, Zhaoying Wang, Mingkui Li, Yingjing Jiang, Lv Lu, Liang Su, Jiayu Fu, Peinan Yu, Weiguo Liu, Lixin Wu, Lanning Wang, Xin Liu, Dexun Chen, Guangwen Yang | 2024-04-16 | Download | With current and future leading systems adopting heterogeneous architectures, adapting existing models for heterogeneous supercomputers is of urgent need for improving model resolution and reducing mo... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Top-k Multi-Armed Bandit Learning for Content Dissemination in Swarms of Micro-UAVs | Amit Kumar Bhuyan, Hrishikesh Dutta, Subir Biswas | 2024-04-16 | Download | This paper presents a Micro-Unmanned Aerial Vehicle (UAV)-enhanced content management system for disaster scenarios where communication infrastructure is generally compromised. |
| A Calibrated and Automated Simulator for Innovations in 5G | Conrado Boeira, Antor Hasan, Khaleda Papry, Yue Ju, Zhongwen Zhu, Israat Haque | 2024-04-16 | Download | The rise of 5G deployments has created the environment for many emerging technologies to flourish. Self-driving vehicles, Augmented and Virtual Reality, and remote operations are examples of applicati... |
| Exploring Post Quantum Cryptography with Quantum Key Distribution for Sustainable Mobile Network Architecture Design | Sanzida Hoque, Abdullah Aydeger, Engin Zeydan | 2024-04-16 | Download | The proliferation of mobile networks and their increasing importance to modern life, combined with the emerging threat of quantum computing, present new challenges and opportunities for cybersecurity. |
| Generative AI for Advanced UAV Networking | Geng Sun, Wenwen Xie, Dusit Niyato, Hongyang Du, Jiawen Kang, Jing Wu, Sumei Sun, Ping Zhang | 2024-04-16 | Download | With the impressive achievements of chatGPT and Sora, generative artificial intelligence (GAI) has received increasing attention. Not limited to the field of content generation, GAI is also widely use... |
| Learning Wireless Data Knowledge Graph for Green Intelligent Communications: Methodology and Experiments | Yongming Huang, Xiaohu You, Hang Zhan, Shiwen He, Ningning Fu, Wei Xu | 2024-04-16 | Download | Intelligent communications have played a pivotal role in shaping the evolution of 6G networks. Native artificial intelligence (AI) within green communication systems must meet stringent real-time requ... |
| Smart Pilot Assignment for IoT in Massive MIMO Systems: A Path Towards Scalable IoT Infrastructure | Muhammad Kamran Saeed, Ashfaq Khokhar | 2024-04-16 | Download | 5G sets the foundation for an era of creativity with its faster speeds, increased data throughput, reduced latency, and enhanced IoT connectivity, all enabled by Massive MIMO (M-MIMO) technology. |

cs.PF - Performance

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| VDTuner: Automated Performance Tuning for Vector Data Management Systems | Tiannuo Yang, Wen Hu, Wangqi Peng, Yusen Li, Jianguo Li, Gang Wang, Xiaoguang Liu | 2024-04-16 | Download | Vector data management systems (VDMSs) have become an indispensable cornerstone in large-scale information retrieval and machine learning systems like large language models. |
| Accelerating Particle-in-Cell Monte Carlo Simulations with MPI, OpenMP/OpenACC and Asynchronous Multi-GPU Programming | Jeremy J. Williams, Felix Liu, Jordy Trilaksono, David Tskhakaya, Stefan Costea, Leon Kos, Ales Podolnik, Jakub Hromadka, Pratibha Hegde, Marta Garcia-Gasulla, Valentin Seitz, Frank Jenko, Erwin Laure, Stefano Markidis | 2024-04-16 | Download | As fusion energy devices advance, plasma simulations are crucial for reactor design. Our work extends BIT1 hybrid parallelization by integrating MPI with OpenMP and OpenACC, focusing on asynchronous m... |
