2024-01-25

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Designing Silicon Brains using LLM: Leveraging ChatGPT for Automated Description of a Spiking Neuron Array | Michael Tomlinson, Joe Li, Andreas Andreou | 2024-01-25 | Download | Large language models (LLMs) have made headlines for synthesizing correct-sounding responses to a variety of prompts, including code generation. |
| InfiniteEn: A Multi-Source Energy Harvesting System with Load Monitoring Module for Batteryless Internet of Things | Priyesh Pappinisseri Puluckul, Maarten Weyn | 2024-01-25 | Download | This paper presents InfiniteEn, a multi-source energy harvesting platform designed for the Internet of Batteryless Things (IoBT). InfiniteEn incorporates an efficient energy combiner to combine energy... |
| Evaluation of POSIT Arithmetic with Accelerators | Naohito Nakasato, Yuki Murakami, Fumiya Kono, Maho Nakata | 2024-01-25 | Download | We present an evaluation of 32-bit POSIT arithmetic through its implementation as accelerators on FPGAs and GPUs. POSIT, a floating-point number format, adaptively changes the size of its fractional p... |
| FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design | Haojun Xia, Zhen Zheng, Xiaoxia Wu, Shiyang Chen, Zhewei Yao, Stephen Youn, Arash Bakhtiari, Michael Wyatt, Donglin Zhuang, Zhongzhu Zhou, Olatunji Ruwase, Yuxiong He, Shuaiwen Leon Song | 2024-01-25 | Download | Six-bit quantization (FP6) can effectively reduce the size of large language models (LLMs) and preserve the model quality consistently across varied applications. |
| Towards Cheaper Inference in Deep Networks with Lower Bit-Width Accumulators | Yaniv Blumenfeld, Itay Hubara, Daniel Soudry | 2024-01-25 | Download | The majority of the research on the quantization of Deep Neural Networks (DNNs) is focused on reducing the precision of tensors visible by high-level frameworks (e.g. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Interactive and Urgent HPC: Challenges and Opportunities | Albert Reuther, Nick Brown, William Arndt, Johannes Blaschke, Christian Boehme, Antony Chazapis, Bjoern Enders, Robert Henschel, Julian Kunkel, Maxime Martinasso | 2024-01-25 | Download | As a broader set of applications from simulations to data analysis and machine learning require more parallel computational capability, the demand for interactive and urgent high performance computing... |
| Unsealing the secrets of blockchain consensus: A systematic comparison of the formal security of proof-of-work and proof-of-stake | Iván Abellán Álvarez, Vincent Gramlich, Johannes Sedlmeir | 2024-01-25 | Download | With the increasing adoption of decentralized information systems based on a variety of permissionless blockchain networks, the choice of consensus mechanism is at the core of many controversial discu... |
| The Case for Co-Designing Model Architectures with Hardware | Quentin Anthony, Jacob Hatef, Deepak Narayanan, Stella Biderman, Stas Bekman, Junqi Yin, Aamir Shafi, Hari Subramoni, Dhabaleswar Panda | 2024-01-25 | Download | While GPUs are responsible for training the vast majority of state-of-the-art deep learning models, the implications of their architecture are often overlooked when designing new deep learning (DL) mo... |
| ServerlessLLM: Low-Latency Serverless Inference for Large Language Models | Yao Fu, Leyang Xue, Yeqi Huang, Andrei-Octavian Brabete, Dmitrii Ustiugov, Yuvraj Patel, Luo Mai | 2024-01-25 | Download | This paper presents ServerlessLLM, a distributed system designed to support low-latency serverless inference for Large Language Models (LLMs). |
| CHIRON: Accelerating Node Synchronization without Security Trade-offs in Distributed Ledgers | Ray Neiheiser, Arman Babaei, Giannis Alexopoulos, Marios Kogias, Eleftherios Kokoris Kogias | 2024-01-25 | Download | Blockchain performance has historically faced challenges posed by the throughput limitations of consensus algorithms. Recent breakthroughs in research have successfully alleviated these constraints by... |
| Communication-Efficient Federated Learning through Adaptive Weight Clustering and Server-Side Distillation | Vasileios Tsouvalas, Aaqib Saeed, Tanir Ozcelebi, Nirvana Meratnia | 2024-01-25 | Download | Federated Learning (FL) is a promising technique for the collaborative training of deep neural networks across multiple devices while preserving data privacy. |
| Enabling Cross-Camera Collaboration for Video Analytics on Distributed Smart Cameras | Chulhong Min, Juheon Yi, Utku Gunay Acer, Fahim Kawsar | 2024-01-25 | Download | Overlapping cameras offer exciting opportunities to view a scene from different angles, allowing for more advanced, comprehensive and robust analysis. |
| Evaluation of POSIT Arithmetic with Accelerators | Naohito Nakasato, Yuki Murakami, Fumiya Kono, Maho Nakata | 2024-01-25 | Download | We present an evaluation of 32-bit POSIT arithmetic through its implementation as accelerators on FPGAs and GPUs. POSIT, a floating-point number format, adaptively changes the size of its fractional p... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| 5G Network Security Practices: An Overview and Survey | Fatema Bannat Wala, Mariam Kiran | 2024-01-25 | Download | This document provides an overview of 5G network security, describing various components of the 5G core network architecture and what kind of security services are offered by these 5G components. |
| Multicasting Optical Reconfigurable Switch | Niyazi Ulas Dinc, Mustafa Yildirim, Ilker Oguz, Christophe Moser, Demetri Psaltis | 2024-01-25 | Download | Artificial Intelligence (AI) demands large data flows within datacenters, heavily relying on multicasting data transfers. As AI models scale, the requirement for high-bandwidth and low-latency network... |
| Model CBOR Serialization for Federated Learning | Koen Zandberg, Mayank Gulati, Gerhard Wunder, Emmanuel Baccelli | 2024-01-25 | Download | The typical federated learning workflow requires communication between a central server and a large set of clients synchronizing model parameters between each other. |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| MoE-Infinity: Efficient MoE Inference on Personal Machines with Sparsity-Aware Expert Cache | Leyang Xue, Yao Fu, Zhan Lu, Luo Mai, Mahesh Marina | 2024-01-25 | Download | This paper presents MoE-Infinity, an efficient MoE inference system designed for personal machines with limited GPU memory capacity. The key idea for MoE-Infinity is that on personal machines, which a... |
