2024-03-30
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| DE-HNN: An effective neural model for Circuit Netlist representation | Zhishang Luo, Truong Son Hy, Puoya Tabaghi, Donghyeon Koh, Michael Defferrard, Elahe Rezaei, Ryan Carey, Rhett Davis, Rajeev Jain, Yusu Wang | 2024-03-30 | Download | The run-time for optimization tools used in chip design has grown with the complexity of designs to the point where it can take several days to go through one design cycle, which has become a bottlenec... |
| Inexactness and Correction of Floating-Point Reciprocal, Division and Square Root | Lucas M. Dutton, Christopher Kumar Anand, Robert Enenkel, Silvia Melitta Müller | 2024-03-30 | Download | Floating-point arithmetic performance determines the overall performance of important applications, from graphics to AI. Meeting the IEEE-754 specification for floating-point requires that final resul... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Computation and Communication Efficient Lightweighting Vertical Federated Learning for Smart Building IoT | Heqiang Wang, Xiang Liu, Yucheng Liu, Jia Zhou, Weihong Yang, Xiaoxiong Zhong | 2024-03-30 | Download | With the increasing number and enhanced capabilities of IoT devices in smart buildings, these devices are evolving beyond basic data collection and control to actively participate in deep learning tas... |
| Communication Efficient Distributed Training with Distributed Lion | Bo Liu, Lemeng Wu, Lizhang Chen, Kaizhao Liang, Jiaxu Zhu, Chen Liang, Raghuraman Krishnamoorthi, Qiang Liu | 2024-03-30 | Download | The Lion optimizer has been a promising competitor to AdamW for training large AI models, with advantages in memory, computation, and sample efficiency. |
| Going Forward-Forward in Distributed Deep Learning | Ege Aktemur, Ege Zorlutuna, Kaan Bilgili, Tacettin Emre Bok, Berrin Yanikoglu, Suha Orhun Mutluergil | 2024-03-30 | Download | We introduce a new approach in distributed deep learning, utilizing Geoffrey Hinton's Forward-Forward (FF) algorithm to speed up the training of neural networks in distributed computing environments. |
| Asymptotically Optimal Scheduling of Multiple Parallelizable Job Classes | Benjamin Berg, Benjamin Moseley, Weina Wang, Mor Harchol-Balter | 2024-03-30 | Download | Modern computing workloads are often composed of parallelizable jobs. A parallelizable job can be completed more quickly when run on additional servers. |
| Engineering A Workload-balanced Push-Relabel Algorithm for Massive Graphs on GPUs | Chou-Ying Hsieh, Po-Chieh Lin, Sy-Yen Kuo | 2024-03-30 | Download | The push-relabel algorithm is an efficient algorithm that solves the maximum flow/minimum cut problems, owing to its affinity to parallelization. As the size of graphs grows exponentially, researchers have ... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Asymptotically Optimal Scheduling of Multiple Parallelizable Job Classes | Benjamin Berg, Benjamin Moseley, Weina Wang, Mor Harchol-Balter | 2024-03-30 | Download | Modern computing workloads are often composed of parallelizable jobs. A parallelizable job can be completed more quickly when run on additional servers. |