2024-09-20
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| LoopTree: Exploring the Fused-layer Dataflow Accelerator Design Space | Michael Gilbert, Yannan Nellie Wu, Joel S. Emer, Vivienne Sze | 2024-09-20 | Download | Latency and energy consumption are key metrics in the performance of deep neural network (DNN) accelerators. A significant factor contributing to latency and energy is data transfers. |
| RapidOMS: FPGA-based Open Modification Spectral Library Searching with HD Computing | Sumukh Pinge, Weihong Xu, Wout Bittremieux, Niema Moshiri, Sang-Woo Jun, Tajana Rosing | 2024-09-20 | Download | Mass spectrometry (MS) is essential for protein analysis but faces significant challenges with large datasets and complex post-translational modifications, resulting in difficulties in spectral identi... |
| Towards Efficient Neuro-Symbolic AI: From Workload Characterization to Hardware Architecture | Zishen Wan, Che-Kai Liu, Hanchen Yang, Ritik Raj, Chaojian Li, Haoran You, Yonggan Fu, Cheng Wan, Sixu Li, Youbin Kim, Ananda Samajdar, Yingyan Celine Lin, Mohamed Ibrahim, Jan M. Rabaey, Tushar Krishna, Arijit Raychowdhury | 2024-09-20 | Download | The remarkable advancements in artificial intelligence (AI), primarily driven by deep neural networks, are facing challenges surrounding unsustainable computational trajectories, limited robustness, a... |
| Learning to Compare Hardware Designs for High-Level Synthesis | Yunsheng Bai, Atefeh Sohrabizadeh, Zijian Ding, Rongjian Liang, Weikai Li, Ding Wang, Haoxing Ren, Yizhou Sun, Jason Cong | 2024-09-20 | Download | High-level synthesis (HLS) is an automated design process that transforms high-level code into hardware designs, enabling the rapid development of hardware accelerators. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| DP-FedSAM: Enhancing Differentially Private Federated Learning Through Personalized Sharpness-Aware Minimization | Zhenxiao Zhang, Yuanxiong Guo, Yanmin Gong | 2024-09-20 | Download | Federated learning (FL) is a distributed machine learning approach that allows multiple clients to collaboratively train a model without sharing their raw data. |
| End-Cloud Collaboration Framework for Advanced AI Customer Service in E-commerce | Liangyu Teng, Yang Liu, Jing Liu, Liang Song | 2024-09-20 | Download | In recent years, the e-commerce industry has seen a rapid increase in the demand for advanced AI-driven customer service solutions. Traditional cloud-based models face limitations in terms of latency,... |
| SatFed: A Resource-Efficient LEO Satellite-Assisted Heterogeneous Federated Learning Framework | Yuxin Zhang, Zheng Lin, Zhe Chen, Zihan Fang, Wenjun Zhu, Xianhao Chen, Jin Zhao, Yue Gao | 2024-09-20 | Download | Traditional federated learning (FL) frameworks rely heavily on terrestrial networks, where coverage limitations and increasing bandwidth congestion significantly hinder model convergence. |
| Local problems in trees across a wide range of distributed models | Anubhav Dhar, Eli Kujawa, Henrik Lievonen, Augusto Modanese, Mikail Muftuoglu, Jan Studený, Jukka Suomela | 2024-09-20 | Download | The randomized online-LOCAL model captures a number of models of computing; it is at least as strong as all of these models: - the classical LOCAL model of distributed graph algorithms, - the quan... |
| Noise-Robust and Resource-Efficient ADMM-based Federated Learning | Ehsan Lari, Reza Arablouei, Vinay Chakravarthi Gogineni, Stefan Werner | 2024-09-20 | Download | Federated learning (FL) leverages client-server communications to train global models on decentralized data. However, communication noise or errors can impair model accuracy. |
| RapidOMS: FPGA-based Open Modification Spectral Library Searching with HD Computing | Sumukh Pinge, Weihong Xu, Wout Bittremieux, Niema Moshiri, Sang-Woo Jun, Tajana Rosing | 2024-09-20 | Download | Mass spectrometry (MS) is essential for protein analysis but faces significant challenges with large datasets and complex post-translational modifications, resulting in difficulties in spectral identi... |
| Flexible Swapping for the Cloud | Milan Pandurov, Lukas Humbel, Dmitry Sepp, Adamos Ttofari, Leon Thomm, Do Le Quoc, Siddharth Chandrasekaran, Sharan Santhanam, Chuan Ye, Shai Bergman, Wei Wang, Sven Lundgren, Konstantinos Sagonas, Alberto Ros | 2024-09-20 | Download | Memory has become the primary cost driver in cloud data centers. Yet, a significant portion of memory allocated to VMs in public clouds remains unused. |
| Performance Enhancement of the Ozaki Scheme on Integer Matrix Multiplication Unit | Yuki Uchino, Katsuhisa Ozaki, Toshiyuki Imamura | 2024-09-20 | Download | This study was aimed at simultaneously achieving sufficient accuracy and high performance for general matrix multiplications. Recent architectures, such as NVIDIA GPUs, feature high-performance units ... |
| Optimizing RLHF Training for Large Language Models with Stage Fusion | Yinmin Zhong, Zili Zhang, Bingyang Wu, Shengyu Liu, Yukun Chen, Changyi Wan, Hanpeng Hu, Lei Xia, Ranchen Ming, Yibo Zhu, Xin Jin | 2024-09-20 | Download | We present RLHFuse, an efficient training system with stage fusion for Reinforcement Learning from Human Feedback (RLHF). Due to the intrinsic nature of RLHF training, i.e. |
| Stabl: Blockchain Fault Tolerance | Vincent Gramoli, Rachid Guerraoui, Andrei Lebedev, Gauthier Voron | 2024-09-20 | Download | Blockchain promises to make online services more fault tolerant due to their inherent distributed nature. Their ability to execute arbitrary programs in different geo-distributed regions and on divers... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| A Centrality Approach to Select Offloading Data Aggregation Points in Vehicular Sensor Networks | Douglas Moura, Geymerson S. Ramos, Andre L. L. Aquino, Antonio Loureiro | 2024-09-20 | Download | This work proposes a centrality-based approach to identify data offloading points in a VSN. The solution presents a scheme to select vehicles used as aggregation points to collect and aggregate other ... |
| Efficient Entanglement Routing for Satellite-Aerial-Terrestrial Quantum Networks | Yu Zhang, Yanmin Gong, Lei Fan, Yu Wang, Zhu Han, Yuanxiong Guo | 2024-09-20 | Download | In the era of 6G and beyond, space-aerial-terrestrial quantum networks (SATQNs) are shaping the future of the global-scale quantum Internet. This paper investigates the collaboration among satellite, ... |
| Quantum-Assisted Joint Virtual Network Function Deployment and Maximum Flow Routing for Space Information Networks | Yu Zhang, Yanmin Gong, Lei Fan, Yu Wang, Zhu Han, Yuanxiong Guo | 2024-09-20 | Download | Network function virtualization (NFV)-enabled space information network (SIN) has emerged as a promising method to facilitate global coverage and seamless service. |
| Integrating Deterministic Networking with 5G | Yash Deshpande, Philip Diederich, Muhamad Luthfi, Laura Becker, José Fontalvo-Hernández, Wolfgang Kellerer | 2024-09-20 | Download | The rising prevalence of real-time applications that require deterministic communication over mobile networks necessitates the joint operation of both mobile and fixed network components. |
| Post-Quantum Cryptography Anonymous Scheme -- PQCWC: Post-Quantum Cryptography Winternitz-Chen | Abel C. H. Chen | 2024-09-20 | Download | As quantum computing technology matures, it poses a threat to the security of mainstream asymmetric cryptographic methods. In response, the National Institute of Standards and Technology released the ... |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Flexible Swapping for the Cloud | Milan Pandurov, Lukas Humbel, Dmitry Sepp, Adamos Ttofari, Leon Thomm, Do Le Quoc, Siddharth Chandrasekaran, Sharan Santhanam, Chuan Ye, Shai Bergman, Wei Wang, Sven Lundgren, Konstantinos Sagonas, Alberto Ros | 2024-09-20 | Download | Memory has become the primary cost driver in cloud data centers. Yet, a significant portion of memory allocated to VMs in public clouds remains unused. |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| RAVE: RISC-V Analyzer of Vector Executions, a QEMU tracing plugin | Pablo Vizcaino, Filippo Mantovani, Jesus Labarta, Roger Ferrer | 2024-09-20 | Download | Simulators are crucial during the development of a chip, like the RISC-V accelerator designed in the European Processor Initiative project. In this paper, we showcase the limitations of the current si... |
| Stabl: Blockchain Fault Tolerance | Vincent Gramoli, Rachid Guerraoui, Andrei Lebedev, Gauthier Voron | 2024-09-20 | Download | Blockchain promises to make online services more fault tolerant due to their inherent distributed nature. Their ability to execute arbitrary programs in different geo-distributed regions and on divers... |