2024-03-11

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| TCAM-SSD: A Framework for Search-Based Computing in Solid-State Drives | Ryan Wong, Nikita Kim, Kevin Higgs, Sapan Agarwal, Engin Ipek, Saugata Ghose, Ben Feinberg | 2024-03-11 | Download | As the amount of data produced in society continues to grow at an exponential rate, modern applications are incurring significant performance and energy penalties due to high data movement between the... |
| Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System | Hongsun Jang, Jaeyong Song, Jaewon Jung, Jaeyoung Park, Youngsok Kim, Jinho Lee | 2024-03-11 | Download | The recent huge advance of Large Language Models (LLMs) is mainly driven by the increase in the number of parameters. This has led to substantial memory capacity requirements, necessitating the use of... |
| From English to ASIC: Hardware Implementation with Large Language Model | Emil Goh, Maoyang Xiang, I-Chyn Wey, T. Hui Teo | 2024-03-11 | Download | In the realm of ASIC engineering, the landscape has been significantly reshaped by the rapid development of LLM, paralleled by an increase in the complexity of modern digital circuits. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| DrJAX: Scalable and Differentiable MapReduce Primitives in JAX | Keith Rush, Zachary Charles, Zachary Garrett, Sean Augenstein, Nicole Mitchell | 2024-03-11 | Download | We present DrJAX, a JAX-based library designed to support large-scale distributed and parallel machine learning algorithms that use MapReduce-style operations. |
| PISA: An Adversarial Approach To Comparing Task Graph Scheduling Algorithms | Jared Coleman, Bhaskar Krishnamachari | 2024-03-11 | Download | Scheduling a task graph representing an application over a heterogeneous network of computers is a fundamental problem in distributed computing. |
| Parameterized Task Graph Scheduling Algorithm for Comparing Algorithmic Components | Jared Coleman, Ravi Vivek Agrawal, Ebrahim Hirani, Bhaskar Krishnamachari | 2024-03-11 | Download | Scheduling distributed applications modeled as directed, acyclic task graphs to run on heterogeneous compute networks is a fundamental (NP-Hard) problem in distributed computing for which many heurist... |
| Optimizing sDTW for AMD GPUs | Daniel Latta-Lin, Sofia Isadora Padilla Munoz | 2024-03-11 | Download | Subsequence Dynamic Time Warping (sDTW) is the metric of choice when performing many sequence matching and alignment tasks. While sDTW is flexible and accurate, it is neither simple nor fast to comput... |
| Dynamic Client Clustering, Bandwidth Allocation, and Workload Optimization for Semi-synchronous Federated Learning | Liangkun Yu, Xiang Sun, Rana Albelaihi, Chaeeun Park, Sihua Shao | 2024-03-11 | Download | Federated Learning (FL) revolutionizes collaborative machine learning among Internet of Things (IoT) devices by enabling them to train models collectively while preserving data privacy. |
| SFVInt: Simple, Fast and Generic Variable-Length Integer Decoding using Bit Manipulation Instructions | Gang Liao, Ye Liu, Yonghua Ding, Le Cai, Jianjun Chen | 2024-03-11 | Download | The ubiquity of variable-length integers in data storage and communication necessitates efficient decoding techniques. In this paper, we present SFVInt, a simple and fast approach to decode the preval... |
| Streamlining in the Riemannian Realm: Efficient Riemannian Optimization with Loopless Variance Reduction | Yury Demidovich, Grigory Malinovsky, Peter Richtárik | 2024-03-11 | Download | In this study, we investigate stochastic optimization on Riemannian manifolds, focusing on the crucial variance reduction mechanism used in both Euclidean and Riemannian settings. |
| Data Poisoning Attacks in Gossip Learning | Alexandre Pham, Maria Potop-Butucaru, Sébastien Tixeuil, Serge Fdida | 2024-03-11 | Download | Traditional machine learning systems were designed in a centralized manner. In such designs, the central entity maintains both the machine learning model and the data used to adjust the model's parame... |
| LoHan: Low-Cost High-Performance Framework to Fine-Tune 100B Model on a Consumer GPU | Changyue Liao, Mo Sun, Zihan Yang, Jun Xie, Kaiqi Chen, Binhang Yuan, Fei Wu, Zeke Wang | 2024-03-11 | Download | Nowadays, AI researchers become more and more interested in fine-tuning a pre-trained LLM, whose size has grown to up to over 100B parameters, for their downstream tasks. |
| A Converting Autoencoder Toward Low-latency and Energy-efficient DNN Inference at the Edge | Hasanul Mahmud, Peng Kang, Kevin Desai, Palden Lama, Sushil Prasad | 2024-03-11 | Download | Reducing inference time and energy usage while maintaining prediction accuracy has become a significant concern for deep neural networks (DNN) inference on resource-constrained edge devices. |
| AGAThA: Fast and Efficient GPU Acceleration of Guided Sequence Alignment for Long Read Mapping | Seongyeon Park, Junguk Hong, Jaeyong Song, Hajin Kim, Youngsok Kim, Jinho Lee | 2024-03-11 | Download | With the advance in genome sequencing technology, the lengths of deoxyribonucleic acid (DNA) sequencing results are rapidly increasing at lower prices than ever. |
| Accelerating Sparse Tensor Decomposition Using Adaptive Linearized Representation | Jan Laukemann, Ahmed E. Helal, S. Isaac Geronimo Anderson, Fabio Checconi, Yongseok Soh, Jesmin Jahan Tithi, Teresa Ranadive, Brian J Gravelle, Fabrizio Petrini, Jee Choi | 2024-03-11 | Download | High-dimensional sparse data emerge in many critical application domains such as healthcare and cybersecurity. To extract meaningful insights from massive volumes of these multi-dimensional data, scie... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| The World Wide Web in the Face of Future Internet Architectures | Harshad Shirwadkar | 2024-03-11 | Download | The World Wide Web (WWW) has arguably been the most popular application of the Internet for years. Over a period of time, it has developed over the principles of host centric IP internet. |
| Monitoring the Venice Lagoon: an IoT Cloud-Based Sensor Nerwork Approach | Filippo Campagnaro, Matin Ghalkhani, Riccardo Tumiati, Federico Marin, Matteo Del Grande, Alessandro Pozzebon, Davide De Battisti, Roberto Francescon, Michele Zorzi | 2024-03-11 | Download | Monitoring the coastal area of the Venice Lagoon is of significant importance. While the impact of global warming is felt worldwide, coastal and littoral regions bear the brunt more prominently. |
| Joint Source-and-Channel Coding for Small Satellite Applications | Olga Kondrateva, Stefan Dietzel, Björn Scheuermann | 2024-03-11 | Download | Small satellites are widely used today as cost effective means to perform Earth observation and other tasks that generate large amounts of high-dimensional data, such as multi-spectral imagery. |
| Adaptive Federated Learning Over the Air | Chenhao Wang, Zihan Chen, Nikolaos Pappas, Howard H. Yang, Tony Q. S. Quek, H. Vincent Poor | 2024-03-11 | Download | We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training. |

cs.OS - Operating Systems

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Next4: Snapshots in Ext4 File System | Aditya Dani, Shardul Mangade, Piyush Nimbalkar, Harshad Shirwadkar | 2024-03-11 | Download | The growing value of data as a strategic asset has given rise to the necessity of implementing reliable backup and recovery solutions in the most efficient and cost-effective manner. |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Tail Optimality and Performance Analysis of the Nudge-M Scheduling Algorithm | Nils Charlet, Benny Van Houdt | 2024-03-11 | Download | Recently it was shown that the response time of First-Come-First-Served (FCFS) scheduling can be stochastically and asymptotically improved upon by the Nudge scheduling algorithm in case of ligh... |
| Accelerating Sparse Tensor Decomposition Using Adaptive Linearized Representation | Jan Laukemann, Ahmed E. Helal, S. Isaac Geronimo Anderson, Fabio Checconi, Yongseok Soh, Jesmin Jahan Tithi, Teresa Ranadive, Brian J Gravelle, Fabrizio Petrini, Jee Choi | 2024-03-11 | Download | High-dimensional sparse data emerge in many critical application domains such as healthcare and cybersecurity. To extract meaningful insights from massive volumes of these multi-dimensional data, scie... |
