2024-08-05
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Evaluating Large Language Models for Automatic Register Transfer Logic Generation via High-Level Synthesis | Sneha Swaroopa, Rijoy Mukherjee, Anushka Debnath, Rajat Subhra Chakraborty | 2024-08-05 | Download | The ever-growing popularity of large language models (LLMs) has resulted in their increasing adoption for hardware design and verification. Prior research has attempted to assess the capability of LLM... |
| Finite-Time Lyapunov Exponent Calculation on FPGA using High-Level Synthesis Tools | Manuel de Castro, Roberto R. Osorio, Francisco J. Andujar, Rocío Carratalá-Sáez, Yuri Torres, Diego R. Llanos | 2024-08-05 | Download | As the computing capabilities of Field Programmable Gate Arrays (FPGAs) continue to grow, so does the interest in building scientific accelerators around them. |
| Toward Attention-based TinyML: A Heterogeneous Accelerated Architecture and Automated Deployment Flow | Philip Wiese, Gamze İslamoğlu, Moritz Scherer, Luka Macan, Victor J. B. Jung, Alessio Burrello, Francesco Conti, Luca Benini | 2024-08-05 | Download | One of the challenges for Tiny Machine Learning (tinyML) is keeping up with the evolution of Machine Learning models from Convolutional Neural Networks to Transformers. |
| PENDRAM: Enabling High-Performance and Energy-Efficient Processing of Deep Neural Networks through a Generalized DRAM Data Mapping Policy | Rachmad Vidya Wicaksana Putra, Muhammad Abdullah Hanif, Muhammad Shafique | 2024-08-05 | Download | Convolutional Neural Networks (CNNs), a prominent type of Deep Neural Networks (DNNs), have emerged as a state-of-the-art solution for solving machine learning tasks. |
| TrIM, Triangular Input Movement Systolic Array for Convolutional Neural Networks: Architecture and Hardware Implementation | Cristian Sestito, Shady Agwa, Themis Prodromakis | 2024-08-05 | Download | Modern hardware architectures for Convolutional Neural Networks (CNNs), other than targeting high performance, aim at dissipating limited energy. |
| SLO-aware GPU Frequency Scaling for Energy Efficient LLM Inference Serving | Andreas Kosmas Kakolyris, Dimosthenis Masouros, Petros Vavaroutsos, Sotirios Xydis, Dimitrios Soudris | 2024-08-05 | Download | As Large Language Models (LLMs) gain traction, their reliance on power-hungry GPUs places ever-increasing energy demands, raising environmental and monetary concerns. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Mitigating Malicious Attacks in Federated Learning via Confidence-aware Defense | Qilei Li, Ahmed M. Abdelmoniem | 2024-08-05 | Download | Federated Learning (FL) is a distributed machine learning paradigm that enables multiple clients to collaboratively train a global model without sharing their private local data. |
| Toward Smart Scheduling in Tapis | Joe Stubbs, Smruti Padhy, Richard Cardone | 2024-08-05 | Download | The Tapis framework provides APIs for automating job execution on remote resources, including HPC clusters and servers running in the cloud. Tapis can simplify the interaction with remote cyberinfrast... |
| Solving Large Rank-Deficient Linear Least-Squares Problems on Shared-Memory CPU Architectures and GPU Architectures | Mónica Chillarón, Gregorio Quintana-Ortí, Vicente Vidal, Per-Gunnar Martinsson | 2024-08-05 | Download | Solving very large linear systems of equations is a key computational task in science and technology. In many cases, the coefficient matrix of the linear system is rank-deficient, leading to systems t... |
| Asynchronous Latency and Fast Atomic Snapshot | João Paulo Bezerra, Luciano Freitas, Petr Kuznetsov, Matthieu Rambaud | 2024-08-05 | Download | This paper introduces a novel, fast atomic-snapshot protocol for asynchronous message-passing systems. In the process of defining what "fast" means exactly, we spot a few interesting issues that ari... |
| SLO-aware GPU Frequency Scaling for Energy Efficient LLM Inference Serving | Andreas Kosmas Kakolyris, Dimosthenis Masouros, Petros Vavaroutsos, Sotirios Xydis, Dimitrios Soudris | 2024-08-05 | Download | As Large Language Models (LLMs) gain traction, their reliance on power-hungry GPUs places ever-increasing energy demands, raising environmental and monetary concerns. |
| Nonlinear Perturbation-based Non-Convex Optimization over Time-Varying Networks | Mohammadreza Doostmohammadian, Zulfiya R. Gabidullina, Hamid R. Rabiee | 2024-08-05 | Download | Decentralized optimization strategies are helpful for various applications, from networked estimation to distributed machine learning. This paper studies finite-sum minimization problems described ove... |
| Large Language Model Aided QoS Prediction for Service Recommendation | Huiying Liu, Zekun Zhang, Honghao Li, Qilin Wu, Yiwen Zhang | 2024-08-05 | Download | Large language models (LLMs) have improved rapidly in recent years and are now used in a wider range of applications. After being trained on a large text corpus, LLMs obtain the capability ... |
| Enabling Practical Transparent Checkpointing for MPI: A Topological Sort Approach | Yao Xu, Gene Cooperman | 2024-08-05 | Download | MPI is the de facto standard for parallel computing on a cluster of computers. Checkpointing is an important component in any strategy for software resilience and for long-running jobs that must be ex... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Active Learning for WBAN-based Health Monitoring | Cho-Chun Chiu, Tuan Nguyen, Ting He, Shiqiang Wang, Beom-Su Kim, Ki-Il Kim | 2024-08-05 | Download | We consider a novel active learning problem motivated by the need to learn machine learning models for health monitoring in wireless body area networks (WBANs). |
| Performance analysis of a RIS-assisted communications | Hamza Adrat, Laurent Decreusefond, Philippe Martins | 2024-08-05 | Download | Reconfigurable Intelligent Surfaces (RIS) are currently considered for adoption in future 6G standards. ETSI and 3GPP have started feasibility and performance investigations of such a technology. |
| Demystifying AMD SEV Performance Penalty for NFV Deployment | Syafiq Al Atiiq, Aris Cahyadi Risdianto | 2024-08-05 | Download | Network Function Virtualization (NFV) has shifted communication networks towards more adaptable software solutions, but this transition raises new security concerns, particularly in public cloud deplo... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Toward Smart Scheduling in Tapis | Joe Stubbs, Smruti Padhy, Richard Cardone | 2024-08-05 | Download | The Tapis framework provides APIs for automating job execution on remote resources, including HPC clusters and servers running in the cloud. Tapis can simplify the interaction with remote cyberinfrast... |
| Solving Large Rank-Deficient Linear Least-Squares Problems on Shared-Memory CPU Architectures and GPU Architectures | Mónica Chillarón, Gregorio Quintana-Ortí, Vicente Vidal, Per-Gunnar Martinsson | 2024-08-05 | Download | Solving very large linear systems of equations is a key computational task in science and technology. In many cases, the coefficient matrix of the linear system is rank-deficient, leading to systems t... |