2024-01-25
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Designing Silicon Brains using LLM: Leveraging ChatGPT for Automated Description of a Spiking Neuron Array | Michael Tomlinson, Joe Li, Andreas Andreou | 2024-01-25 | Download | Large language models (LLMs) have made headlines for synthesizing correct-sounding responses to a variety of prompts, including code generation. |
| InfiniteEn: A Multi-Source Energy Harvesting System with Load Monitoring Module for Batteryless Internet of Things | Priyesh Pappinisseri Puluckul, Maarten Weyn | 2024-01-25 | Download | This paper presents InfiniteEn, a multi-source energy harvesting platform designed for the Internet of Batteryless Things (IoBT). InfiniteEn incorporates an efficient energy combiner to combine energy... |
| Evaluation of POSIT Arithmetic with Accelerators | Naohito Nakasato, Yuki Murakami, Fumiya Kono, Maho Nakata | 2024-01-25 | Download | We present an evaluation of 32-bit POSIT arithmetic through its implementation as accelerators on FPGAs and GPUs. POSIT, a floating-point number format, adaptively changes the size of its fractional p... |
| FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design | Haojun Xia, Zhen Zheng, Xiaoxia Wu, Shiyang Chen, Zhewei Yao, Stephen Youn, Arash Bakhtiari, Michael Wyatt, Donglin Zhuang, Zhongzhu Zhou, Olatunji Ruwase, Yuxiong He, Shuaiwen Leon Song | 2024-01-25 | Download | Six-bit quantization (FP6) can effectively reduce the size of large language models (LLMs) and preserve the model quality consistently across varied applications. |
| Towards Cheaper Inference in Deep Networks with Lower Bit-Width Accumulators | Yaniv Blumenfeld, Itay Hubara, Daniel Soudry | 2024-01-25 | Download | The majority of the research on the quantization of Deep Neural Networks (DNNs) is focused on reducing the precision of tensors visible by high-level frameworks (e.g. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Interactive and Urgent HPC: Challenges and Opportunities | Albert Reuther, Nick Brown, William Arndt, Johannes Blaschke, Christian Boehme, Antony Chazapis, Bjoern Enders, Robert Henschel, Julian Kunkel, Maxime Martinasso | 2024-01-25 | Download | As a broader set of applications from simulations to data analysis and machine learning require more parallel computational capability, the demand for interactive and urgent high performance computing... |
| Unsealing the secrets of blockchain consensus: A systematic comparison of the formal security of proof-of-work and proof-of-stake | Iván Abellán Álvarez, Vincent Gramlich, Johannes Sedlmeir | 2024-01-25 | Download | With the increasing adoption of decentralized information systems based on a variety of permissionless blockchain networks, the choice of consensus mechanism is at the core of many controversial discu... |
| The Case for Co-Designing Model Architectures with Hardware | Quentin Anthony, Jacob Hatef, Deepak Narayanan, Stella Biderman, Stas Bekman, Junqi Yin, Aamir Shafi, Hari Subramoni, Dhabaleswar Panda | 2024-01-25 | Download | While GPUs are responsible for training the vast majority of state-of-the-art deep learning models, the implications of their architecture are often overlooked when designing new deep learning (DL) mo... |
| ServerlessLLM: Low-Latency Serverless Inference for Large Language Models | Yao Fu, Leyang Xue, Yeqi Huang, Andrei-Octavian Brabete, Dmitrii Ustiugov, Yuvraj Patel, Luo Mai | 2024-01-25 | Download | This paper presents ServerlessLLM, a distributed system designed to support low-latency serverless inference for Large Language Models (LLMs). |
| CHIRON: Accelerating Node Synchronization without Security Trade-offs in Distributed Ledgers | Ray Neiheiser, Arman Babaei, Giannis Alexopoulos, Marios Kogias, Eleftherios Kokoris Kogias | 2024-01-25 | Download | Blockchain performance has historically faced challenges posed by the throughput limitations of consensus algorithms. Recent breakthroughs in research have successfully alleviated these constraints by... |
| Communication-Efficient Federated Learning through Adaptive Weight Clustering and Server-Side Distillation | Vasileios Tsouvalas, Aaqib Saeed, Tanir Ozcelebi, Nirvana Meratnia | 2024-01-25 | Download | Federated Learning (FL) is a promising technique for the collaborative training of deep neural networks across multiple devices while preserving data privacy. |
| Enabling Cross-Camera Collaboration for Video Analytics on Distributed Smart Cameras | Chulhong Min, Juheon Yi, Utku Gunay Acer, Fahim Kawsar | 2024-01-25 | Download | Overlapping cameras offer exciting opportunities to view a scene from different angles, allowing for more advanced, comprehensive and robust analysis. |
| Evaluation of POSIT Arithmetic with Accelerators | Naohito Nakasato, Yuki Murakami, Fumiya Kono, Maho Nakata | 2024-01-25 | Download | We present an evaluation of 32-bit POSIT arithmetic through its implementation as accelerators on FPGAs and GPUs. POSIT, a floating-point number format, adaptively changes the size of its fractional p... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| 5G Network Security Practices: An Overview and Survey | Fatema Bannat Wala, Mariam Kiran | 2024-01-25 | Download | This document provides an overview of 5G network security, describing various components of the 5G core network architecture and what kind of security services are offered by these 5G components. |
| Multicasting Optical Reconfigurable Switch | Niyazi Ulas Dinc, Mustafa Yildirim, Ilker Oguz, Christophe Moser, Demetri Psaltis | 2024-01-25 | Download | Artificial Intelligence (AI) demands large data flows within datacenters, heavily relying on multicasting data transfers. As AI models scale, the requirement for high-bandwidth and low-latency network... |
| Model CBOR Serialization for Federated Learning | Koen Zandberg, Mayank Gulati, Gerhard Wunder, Emmanuel Baccelli | 2024-01-25 | Download | The typical federated learning workflow requires communication between a central server and a large set of clients synchronizing model parameters between each other. |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| MoE-Infinity: Efficient MoE Inference on Personal Machines with Sparsity-Aware Expert Cache | Leyang Xue, Yao Fu, Zhan Lu, Luo Mai, Mahesh Marina | 2024-01-25 | Download | This paper presents MoE-Infinity, an efficient MoE inference system designed for personal machines with limited GPU memory capacity. The key idea for MoE-Infinity is that on personal machines, which a... |