2024-04-16
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Tao: Re-Thinking DL-based Microarchitecture Simulation | Santosh Pandey, Amir Yazdanbakhsh, Hang Liu | 2024-04-16 | Download | Microarchitecture simulators are indispensable tools for microarchitecture designers to validate, estimate, and optimize new hardware that meets specific design requirements. |
| From a Lossless (~1.5:1) Compression Algorithm for Llama2 7B Weights to Variable Precision, Variable Range, Compressed Numeric Data Types for CNNs and LLMs | Vincenzo Liguori | 2024-04-16 | Download | This paper starts with a simple lossless ~1.5:1 compression algorithm for the weights of the Large Language Model (LLM) Llama2 7B [1] that can be implemented in ~200 LUTs in AMD FPGAs, processing over... |
| SA-DS: A Dataset for Large Language Model-Driven AI Accelerator Design Generation | Deepak Vungarala, Mahmoud Nazzal, Mehrdad Morsali, Chao Zhang, Arnob Ghosh, Abdallah Khreishah, Shaahin Angizi | 2024-04-16 | Download | In the ever-evolving landscape of Deep Neural Networks (DNN) hardware acceleration, unlocking the true potential of systolic array accelerators has long been hindered by the daunting challenges of exp... |
| Decade-Bandwidth RF-Input Pseudo-Doherty Load Modulated Balanced Amplifier using Signal-Flow-Based Phase Alignment Design | Pingzhu Gong, Jiachen Guo, Niteesh Bharadwaj Vangipurapu, Kenle Chen | 2024-04-16 | Download | This paper reports a first-ever decade-bandwidth pseudo-Doherty load-modulated balanced amplifier (PD-LMBA), designed for emerging 4G/5G communications and multi-band operations. |
| AERO: Adaptive Erase Operation for Improving Lifetime and Performance of Modern NAND Flash-Based SSDs | Sungjun Cho, Beomjun Kim, Hyunuk Cho, Gyeongseob Seo, Onur Mutlu, Myungsuk Kim, Jisung Park | 2024-04-16 | Download | This work investigates a new erase scheme in NAND flash memory to improve the lifetime and performance of modern solid-state drives (SSDs). In NAND flash memory, an erase operation applies a high volt... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Personalized Federated Learning via Stacking | Emilio Cantu-Cervini | 2024-04-16 | Download | Traditional Federated Learning (FL) methods typically train a single global model collaboratively without exchanging raw data. In contrast, Personalized Federated Learning (PFL) techniques aim to crea... |
| GPU-Based Parallel Computing Methods for Medical Photoacoustic Image Reconstruction | Xinyao Yi, Yuxin Qiao | 2024-04-16 | Download | Recent years have witnessed a rapid advancement in GPU technology, establishing it as a formidable high-performance parallel computing technology with superior floating-point computational capabilitie... |
| Trackable Agent-based Evolution Models at Wafer Scale | Matthew Andres Moreno, Connor Yang, Emily Dolson, Luis Zaman | 2024-04-16 | Download | Continuing improvements in computing hardware are poised to transform capabilities for in silico modeling of cross-scale phenomena underlying major open questions in evolutionary biology and artificia... |
| Sustainability of Data Center Digital Twins with Reinforcement Learning | Soumyendu Sarkar, Avisek Naug, Antonio Guillen, Ricardo Luna, Vineet Gundecha, Ashwin Ramesh Babu, Sajad Mousavi | 2024-04-16 | Download | The rapid growth of machine learning (ML) has led to an increased demand for computational power, resulting in larger data centers (DCs) and higher energy consumption. |
| Computing Inductive Invariants of Regular Abstraction Frameworks | Philipp Czerner, Javier Esparza, Valentin Krasotin, Christoph Welzel-Mohr | 2024-04-16 | Download | Regular transition systems (RTS) are a popular formalism for modeling infinite-state systems in general, and parameterised systems in particular. In a CONCUR 22 paper, Esparza et al. |
| Classical and Quantum Distributed Algorithms for the Survivable Network Design Problem | Phillip Kerger, David E. Bernal Neira, Zoe Gonzalez Izquierdo, Eleanor G. Rieffel | 2024-04-16 | Download | We investigate distributed classical and quantum approaches for the survivable network design problem (SNDP), sometimes called the generalized Steiner problem. |
| A Cloud Resources Portfolio Optimization Business Model -- From Theory to Practice | Valentin Haag, Maximilian Kiessler, Benedikt Pittl, Erich Schikuta | 2024-04-16 | Download | Cloud resources have become increasingly important, with many businesses using cloud solutions to supplement or outright replace their existing IT infrastructure. |
| Benchmarking Machine Learning Applications on Heterogeneous Architecture using Reframe | Christopher Rae, Joseph K. L. Lee, James Richings, Michele Weiland | 2024-04-16 | Download | With the rapid increase in machine learning workloads performed on HPC systems, it is beneficial to regularly perform machine learning specific benchmarks to monitor performance and identify issues. |
| LAECIPS: Large Vision Model Assisted Adaptive Edge-Cloud Collaboration for IoT-based Embodied Intelligence System | Shijing Hu, Zhihui Lu, Xin Xu, Ruijun Deng, Xin Du, Qiang Duan | 2024-04-16 | Download | Embodied intelligence (EI) enables manufacturing systems to flexibly perceive, reason, adapt, and operate within dynamic shop floor environments. |
| BoLD: Fast and Cheap Dispute Resolution | Mario M. Alvarez, Henry Arneson, Ben Berger, Lee Bousfield, Chris Buckland, Yafah Edelman, Edward W. Felten, Daniel Goldman, Raul Jordan, Mahimna Kelkar, Akaki Mamageishvili, Harry Ng, Aman Sanghi, Victor Shoup, Terence Tsao | 2024-04-16 | Download | BoLD is a new dispute resolution protocol that is designed to replace the originally deployed Arbitrum dispute resolution protocol. Unlike that protocol, BoLD is resistant to delay attacks. |
| Paving the Way to Hybrid Quantum-Classical Scientific Workflows | Sandeep Suresh Cranganore, Vincenzo De Maio, Ivona Brandic, Ewa Deelman | 2024-04-16 | Download | The increasing growth of data volume, and the consequent explosion in demand for computational power, are affecting scientific computing, as shown by the rise of extreme data scientific workflows. |
| I/O in Machine Learning Applications on HPC Systems: A 360-degree Survey | Noah Lewis, Jean Luca Bez, Surendra Byna | 2024-04-16 | Download | Growing interest in Artificial Intelligence (AI) has resulted in a surge in demand for faster methods of Machine Learning (ML) model training and inference. |
| Accelerating Particle-in-Cell Monte Carlo Simulations with MPI, OpenMP/OpenACC and Asynchronous Multi-GPU Programming | Jeremy J. Williams, Felix Liu, Jordy Trilaksono, David Tskhakaya, Stefan Costea, Leon Kos, Ales Podolnik, Jakub Hromadka, Pratibha Hegde, Marta Garcia-Gasulla, Valentin Seitz, Frank Jenko, Erwin Laure, Stefano Markidis | 2024-04-16 | Download | As fusion energy devices advance, plasma simulations are crucial for reactor design. Our work extends BIT1 hybrid parallelization by integrating MPI with OpenMP and OpenACC, focusing on asynchronous m... |
| Privacy-Enhanced Training-as-a-Service for On-Device Intelligence: Concept, Architectural Scheme, and Open Problems | Zhiyuan Wu, Sheng Sun, Yuwei Wang, Min Liu, Bo Gao, Tianliu He, Wen Wang | 2024-04-16 | Download | On-device intelligence (ODI) enables artificial intelligence (AI) applications to run on end devices, providing real-time and customized AI inference without relying on remote servers. |
| Kilometer-Level Coupled Modeling Using 40 Million Cores: An Eight-Year Journey of Model Development | Xiaohui Duan, Yuxuan Li, Zhao Liu, Bin Yang, Juepeng Zheng, Haohuan Fu, Shaoqing Zhang, Shiming Xu, Yang Gao, Wei Xue, Di Wei, Xiaojing Lv, Lifeng Yan, Haopeng Huang, Haitian Lu, Lingfeng Wan, Haoran Lin, Qixin Chang, Chenlin Li, Quanjie He, Zeyu Song, Xuantong Wang, Yangyang Yu, Xilong Fan, Zhaopeng Qu, Yankun Xu, Xiuwen Guo, Yunlong Fei, Zhaoying Wang, Mingkui Li, Yingjing Jiang, Lv Lu, Liang Su, Jiayu Fu, Peinan Yu, Weiguo Liu, Lixin Wu, Lanning Wang, Xin Liu, Dexun Chen, Guangwen Yang | 2024-04-16 | Download | With current and future leading systems adopting heterogeneous architectures, adapting existing models for heterogeneous supercomputers is of urgent need for improving model resolution and reducing mo... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Top-k Multi-Armed Bandit Learning for Content Dissemination in Swarms of Micro-UAVs | Amit Kumar Bhuyan, Hrishikesh Dutta, Subir Biswas | 2024-04-16 | Download | This paper presents a Micro-Unmanned Aerial Vehicle (UAV)-enhanced content management system for disaster scenarios where communication infrastructure is generally compromised. |
| A Calibrated and Automated Simulator for Innovations in 5G | Conrado Boeira, Antor Hasan, Khaleda Papry, Yue Ju, Zhongwen Zhu, Israat Haque | 2024-04-16 | Download | The rise of 5G deployments has created the environment for many emerging technologies to flourish. Self-driving vehicles, Augmented and Virtual Reality, and remote operations are examples of applicati... |
| Exploring Post Quantum Cryptography with Quantum Key Distribution for Sustainable Mobile Network Architecture Design | Sanzida Hoque, Abdullah Aydeger, Engin Zeydan | 2024-04-16 | Download | The proliferation of mobile networks and their increasing importance to modern life, combined with the emerging threat of quantum computing, present new challenges and opportunities for cybersecurity. |
| Generative AI for Advanced UAV Networking | Geng Sun, Wenwen Xie, Dusit Niyato, Hongyang Du, Jiawen Kang, Jing Wu, Sumei Sun, Ping Zhang | 2024-04-16 | Download | With the impressive achievements of ChatGPT and Sora, generative artificial intelligence (GAI) has received increasing attention. Not limited to the field of content generation, GAI is also widely use... |
| Learning Wireless Data Knowledge Graph for Green Intelligent Communications: Methodology and Experiments | Yongming Huang, Xiaohu You, Hang Zhan, Shiwen He, Ningning Fu, Wei Xu | 2024-04-16 | Download | Intelligent communications have played a pivotal role in shaping the evolution of 6G networks. Native artificial intelligence (AI) within green communication systems must meet stringent real-time requ... |
| Smart Pilot Assignment for IoT in Massive MIMO Systems: A Path Towards Scalable IoT Infrastructure | Muhammad Kamran Saeed, Ashfaq Khokhar | 2024-04-16 | Download | 5G sets the foundation for an era of creativity with its faster speeds, increased data throughput, reduced latency, and enhanced IoT connectivity, all enabled by Massive MIMO (M-MIMO) technology. |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| VDTuner: Automated Performance Tuning for Vector Data Management Systems | Tiannuo Yang, Wen Hu, Wangqi Peng, Yusen Li, Jianguo Li, Gang Wang, Xiaoguang Liu | 2024-04-16 | Download | Vector data management systems (VDMSs) have become an indispensable cornerstone in large-scale information retrieval and machine learning systems like large language models. |
| Accelerating Particle-in-Cell Monte Carlo Simulations with MPI, OpenMP/OpenACC and Asynchronous Multi-GPU Programming | Jeremy J. Williams, Felix Liu, Jordy Trilaksono, David Tskhakaya, Stefan Costea, Leon Kos, Ales Podolnik, Jakub Hromadka, Pratibha Hegde, Marta Garcia-Gasulla, Valentin Seitz, Frank Jenko, Erwin Laure, Stefano Markidis | 2024-04-16 | Download | As fusion energy devices advance, plasma simulations are crucial for reactor design. Our work extends BIT1 hybrid parallelization by integrating MPI with OpenMP and OpenACC, focusing on asynchronous m... |