2024-06-15
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Triangel: A High-Performance, Accurate, Timely On-Chip Temporal Prefetcher | Sam Ainsworth, Lev Mukhanov | 2024-06-15 | Download | Temporal prefetching, where correlated pairs of addresses are logged and replayed on repeat accesses, has recently become viable in commercial designs. |
| Efficient Hardware Accelerator Based on Medium Granularity Dataflow for SpTRSV | Qian Chen, Xiaofeng Yang, Shengli Lu | 2024-06-15 | Download | Sparse triangular solve (SpTRSV) is widely used in various domains. Numerous studies have been conducted using CPUs, GPUs, and specific hardware accelerators, where dataflows can be categorized into c... |
| FuseMax: Leveraging Extended Einsums to Optimize Attention Accelerator Design | Nandeeka Nayak, Xinrui Wu, Toluwanimi O. Odemuyiwa, Michael Pellauer, Joel S. Emer, Christopher W. Fletcher | 2024-06-15 | Download | Attention for transformers is a critical workload that has recently received significant "attention" as a target for custom acceleration. Yet, while prior work succeeds in reducing attention's memory-... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Breaking the Memory Wall: A Study of I/O Patterns and GPU Memory Utilization for Hybrid CPU-GPU Offloaded Optimizers | Avinash Maurya, Jie Ye, M. Mustafa Rafique, Franck Cappello, Bogdan Nicolae | 2024-06-15 | Download | Transformers and LLMs have seen rapid adoption in all domains. Their sizes have exploded to hundreds of billions of parameters and keep increasing. |
| DataStates-LLM: Lazy Asynchronous Checkpointing for Large Language Models | Avinash Maurya, Robert Underwood, M. Mustafa Rafique, Franck Cappello, Bogdan Nicolae | 2024-06-15 | Download | LLMs have seen rapid adoption in all domains. They need to be trained on high-end high-performance computing (HPC) infrastructures and ingest massive amounts of input data. |
| HiFGL: A Hierarchical Framework for Cross-silo Cross-device Federated Graph Learning | Zhuoning Guo, Duanyi Yao, Qiang Yang, Hao Liu | 2024-06-15 | Download | Federated Graph Learning (FGL) has emerged as a promising way to learn high-quality representations from distributed graph data with privacy preservation. |
| Efficient Hardware Accelerator Based on Medium Granularity Dataflow for SpTRSV | Qian Chen, Xiaofeng Yang, Shengli Lu | 2024-06-15 | Download | Sparse triangular solve (SpTRSV) is widely used in various domains. Numerous studies have been conducted using CPUs, GPUs, and specific hardware accelerators, where dataflows can be categorized into c... |
| Federated Neural Radiance Field for Distributed Intelligence | Yintian Zhang, Ziyu Shao | 2024-06-15 | Download | Novel view synthesis (NVS) is an important technology for many AR and VR applications. The recently proposed Neural Radiance Field (NeRF) approach has demonstrated superior performance on NVS tasks, a... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| A Novel Joint DRL-Based Utility Optimization for UAV Data Services | Xuli Cai, Poonam Lohan, Burak Kantarci | 2024-06-15 | Download | In this paper, we propose a novel joint deep reinforcement learning (DRL)-based solution to optimize the utility of an uncrewed aerial vehicle (UAV)-assisted communication network. |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| ROSfs: A User-Level File System for ROS | Zijun Xu, Xuanjun Wen, Yanjie Song, Shu Yin | 2024-06-15 | Download | We present ROSfs, a novel user-level file system for the Robot Operating System (ROS). ROSfs interprets a robot file as a group of sub-files, with each having a distinct label. |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Efficient Hardware Accelerator Based on Medium Granularity Dataflow for SpTRSV | Qian Chen, Xiaofeng Yang, Shengli Lu | 2024-06-15 | Download | Sparse triangular solve (SpTRSV) is widely used in various domains. Numerous studies have been conducted using CPUs, GPUs, and specific hardware accelerators, where dataflows can be categorized into c... |