2024-03-11

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
TCAM-SSD: A Framework for Search-Based Computing in Solid-State Drives	Ryan Wong, Nikita Kim, Kevin Higgs, Sapan Agarwal, Engin Ipek, Saugata Ghose, Ben Feinberg	2024-03-11	Download	As the amount of data produced in society continues to grow at an exponential rate, modern applications are incurring significant performance and energy penalties due to high data movement between the...
Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System	Hongsun Jang, Jaeyong Song, Jaewon Jung, Jaeyoung Park, Youngsok Kim, Jinho Lee	2024-03-11	Download	The recent huge advance of Large Language Models (LLMs) is mainly driven by the increase in the number of parameters. This has led to substantial memory capacity requirements, necessitating the use of...
From English to ASIC: Hardware Implementation with Large Language Model	Emil Goh, Maoyang Xiang, I-Chyn Wey, T. Hui Teo	2024-03-11	Download	In the realm of ASIC engineering, the landscape has been significantly reshaped by the rapid development of LLM, paralleled by an increase in the complexity of modern digital circuits.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
DrJAX: Scalable and Differentiable MapReduce Primitives in JAX	Keith Rush, Zachary Charles, Zachary Garrett, Sean Augenstein, Nicole Mitchell	2024-03-11	Download	We present DrJAX, a JAX-based library designed to support large-scale distributed and parallel machine learning algorithms that use MapReduce-style operations.
PISA: An Adversarial Approach To Comparing Task Graph Scheduling Algorithms	Jared Coleman, Bhaskar Krishnamachari	2024-03-11	Download	Scheduling a task graph representing an application over a heterogeneous network of computers is a fundamental problem in distributed computing.
Parameterized Task Graph Scheduling Algorithm for Comparing Algorithmic Components	Jared Coleman, Ravi Vivek Agrawal, Ebrahim Hirani, Bhaskar Krishnamachari	2024-03-11	Download	Scheduling distributed applications modeled as directed, acyclic task graphs to run on heterogeneous compute networks is a fundamental (NP-Hard) problem in distributed computing for which many heurist...
Optimizing sDTW for AMD GPUs	Daniel Latta-Lin, Sofia Isadora Padilla Munoz	2024-03-11	Download	Subsequence Dynamic Time Warping (sDTW) is the metric of choice when performing many sequence matching and alignment tasks. While sDTW is flexible and accurate, it is neither simple nor fast to comput...
Dynamic Client Clustering, Bandwidth Allocation, and Workload Optimization for Semi-synchronous Federated Learning	Liangkun Yu, Xiang Sun, Rana Albelaihi, Chaeeun Park, Sihua Shao	2024-03-11	Download	Federated Learning (FL) revolutionizes collaborative machine learning among Internet of Things (IoT) devices by enabling them to train models collectively while preserving data privacy.
SFVInt: Simple, Fast and Generic Variable-Length Integer Decoding using Bit Manipulation Instructions	Gang Liao, Ye Liu, Yonghua Ding, Le Cai, Jianjun Chen	2024-03-11	Download	The ubiquity of variable-length integers in data storage and communication necessitates efficient decoding techniques. In this paper, we present SFVInt, a simple and fast approach to decode the preval...
Streamlining in the Riemannian Realm: Efficient Riemannian Optimization with Loopless Variance Reduction	Yury Demidovich, Grigory Malinovsky, Peter Richtárik	2024-03-11	Download	In this study, we investigate stochastic optimization on Riemannian manifolds, focusing on the crucial variance reduction mechanism used in both Euclidean and Riemannian settings.
Data Poisoning Attacks in Gossip Learning	Alexandre Pham, Maria Potop-Butucaru, Sébastien Tixeuil, Serge Fdida	2024-03-11	Download	Traditional machine learning systems were designed in a centralized manner. In such designs, the central entity maintains both the machine learning model and the data used to adjust the model's parame...
LoHan: Low-Cost High-Performance Framework to Fine-Tune 100B Model on a Consumer GPU	Changyue Liao, Mo Sun, Zihan Yang, Jun Xie, Kaiqi Chen, Binhang Yuan, Fei Wu, Zeke Wang	2024-03-11	Download	Nowadays, AI researchers become more and more interested in fine-tuning a pre-trained LLM, whose size has grown to up to over 100B parameters, for their downstream tasks.
A Converting Autoencoder Toward Low-latency and Energy-efficient DNN Inference at the Edge	Hasanul Mahmud, Peng Kang, Kevin Desai, Palden Lama, Sushil Prasad	2024-03-11	Download	Reducing inference time and energy usage while maintaining prediction accuracy has become a significant concern for deep neural networks (DNN) inference on resource-constrained edge devices.
AGAThA: Fast and Efficient GPU Acceleration of Guided Sequence Alignment for Long Read Mapping	Seongyeon Park, Junguk Hong, Jaeyong Song, Hajin Kim, Youngsok Kim, Jinho Lee	2024-03-11	Download	With the advance in genome sequencing technology, the lengths of deoxyribonucleic acid (DNA) sequencing results are rapidly increasing at lower prices than ever.
Accelerating Sparse Tensor Decomposition Using Adaptive Linearized Representation	Jan Laukemann, Ahmed E. Helal, S. Isaac Geronimo Anderson, Fabio Checconi, Yongseok Soh, Jesmin Jahan Tithi, Teresa Ranadive, Brian J Gravelle, Fabrizio Petrini, Jee Choi	2024-03-11	Download	High-dimensional sparse data emerge in many critical application domains such as healthcare and cybersecurity. To extract meaningful insights from massive volumes of these multi-dimensional data, scie...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
The World Wide Web in the Face of Future Internet Architectures	Harshad Shirwadkar	2024-03-11	Download	The World Wide Web (WWW) has arguably been the most popular application of the Internet for years. Over a period of time, it has developed over the principles of host centric IP internet.
Monitoring the Venice Lagoon: an IoT Cloud-Based Sensor Nerwork Approach	Filippo Campagnaro, Matin Ghalkhani, Riccardo Tumiati, Federico Marin, Matteo Del Grande, Alessandro Pozzebon, Davide De Battisti, Roberto Francescon, Michele Zorzi	2024-03-11	Download	Monitoring the coastal area of the Venice Lagoon is of significant importance. While the impact of global warming is felt worldwide, coastal and littoral regions bear the brunt more prominently.
Joint Source-and-Channel Coding for Small Satellite Applications	Olga Kondrateva, Stefan Dietzel, Björn Scheuermann	2024-03-11	Download	Small satellites are widely used today as cost effective means to perform Earth observation and other tasks that generate large amounts of high-dimensional data, such as multi-spectral imagery.
Adaptive Federated Learning Over the Air	Chenhao Wang, Zihan Chen, Nikolaos Pappas, Howard H. Yang, Tony Q. S. Quek, H. Vincent Poor	2024-03-11	Download	We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training.

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
Next4: Snapshots in Ext4 File System	Aditya Dani, Shardul Mangade, Piyush Nimbalkar, Harshad Shirwadkar	2024-03-11	Download	The growing value of data as a strategic asset has given rise to the necessity of implementing reliable backup and recovery solutions in the most efficient and cost-effective manner.

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Tail Optimality and Performance Analysis of the Nudge-M Scheduling Algorithm	Nils Charlet, Benny Van Houdt	2024-03-11	Download	Recently it was shown that the response time of First-Come-First-Served (FCFS) scheduling can be stochastically and asymptotically improved upon by the Nudge scheduling algorithm in case of ligh...
Accelerating Sparse Tensor Decomposition Using Adaptive Linearized Representation	Jan Laukemann, Ahmed E. Helal, S. Isaac Geronimo Anderson, Fabio Checconi, Yongseok Soh, Jesmin Jahan Tithi, Teresa Ranadive, Brian J Gravelle, Fabrizio Petrini, Jee Choi	2024-03-11	Download	High-dimensional sparse data emerge in many critical application domains such as healthcare and cybersecurity. To extract meaningful insights from massive volumes of these multi-dimensional data, scie...