2024-07-19
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| LaMAGIC: Language-Model-based Topology Generation for Analog Integrated Circuits | Chen-Chia Chang, Yikang Shen, Shaoze Fan, Jing Li, Shun Zhang, Ningyuan Cao, Yiran Chen, Xin Zhang | 2024-07-19 | Download | In the realm of electronic and electrical engineering, automation of analog circuit is increasingly vital given the complexity and customized requirements of modern applications. |
| Performance Modeling and Workload Analysis of Distributed Large Language Model Training and Inference | Joyjit Kundu, Wenzhe Guo, Ali BanaGozar, Udari De Alwis, Sourav Sengupta, Puneet Gupta, Arindam Mallik | 2024-07-19 | Download | Aligning future system design with the ever-increasing compute needs of large language models (LLMs) is undoubtedly an important problem in today's world. |
| Mixed-precision Neural Networks on RISC-V Cores: ISA extensions for Multi-Pumped Soft SIMD Operations | Giorgos Armeniakos, Alexis Maras, Sotirios Xydis, Dimitrios Soudris | 2024-07-19 | Download | Recent advancements in quantization and mixed-precision approaches offers substantial opportunities to improve the speed and energy efficiency of Neural Networks (NN). |
| LoAS: Fully Temporal-Parallel Dataflow for Dual-Sparse Spiking Neural Networks | Ruokai Yin, Youngeun Kim, Di Wu, Priyadarshini Panda | 2024-07-19 | Download | Spiking Neural Networks (SNNs) have gained significant research attention in the last decade due to their potential to drive resource-constrained edge devices. |
| SGDRC: Software-Defined Dynamic Resource Control for Concurrent DNN Inference on NVIDIA GPUs | Yongkang Zhang, Haoxuan Yu, Chenxia Han, Cheng Wang, Baotong Lu, Yunzhe Li, Zhifeng Jiang, Yang Li, Xiaowen Chu, Huaicheng Li | 2024-07-19 | Download | Cloud service providers heavily colocate high-priority, latency-sensitive (LS), and low-priority, best-effort (BE) DNN inference services on the same GPU to improve resource utilization in data center... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Performance Modeling and Workload Analysis of Distributed Large Language Model Training and Inference | Joyjit Kundu, Wenzhe Guo, Ali BanaGozar, Udari De Alwis, Sourav Sengupta, Puneet Gupta, Arindam Mallik | 2024-07-19 | Download | Aligning future system design with the ever-increasing compute needs of large language models (LLMs) is undoubtedly an important problem in today's world. |
| Regression prediction algorithm for energy consumption regression in cloud computing based on horned lizard algorithm optimised convolutional neural network-bidirectional gated recurrent unit | Feiyang Li, Zinan Cao, Qixuan Yu, Xirui Tang | 2024-07-19 | Download | For this paper, a prediction study of cloud computing energy consumption was conducted by optimising the data regression algorithm based on the horned lizard optimisation algorithm for Convolutional N... |
| Mixture of Experts with Mixture of Precisions for Tuning Quality of Service | HamidReza Imani, Abdolah Amirany, Tarek El-Ghazawi | 2024-07-19 | Download | The increasing demand for deploying large Mixture-of-Experts (MoE) models in resource-constrained environments necessitates efficient approaches to address their high memory and computational requirem... |
| The Vision of Autonomic Computing: Can LLMs Make It a Reality? | Zhiyang Zhang, Fangkai Yang, Xiaoting Qin, Jue Zhang, Qingwei Lin, Gong Cheng, Dongmei Zhang, Saravan Rajmohan, Qi Zhang | 2024-07-19 | Download | The Vision of Autonomic Computing (ACV), proposed over two decades ago, envisions computing systems that self-manage akin to biological organisms, adapting seamlessly to changing environments. |
| On the Impact of PRB Load Uncertainty Forecasting for Sustainable Open RAN | Vaishnavi Kasuluru, Luis Blanco, Cristian J. Vaca-Rubio, Engin Zeydan | 2024-07-19 | Download | The transition to sustainable Open Radio Access Network (O-RAN) architectures brings new challenges for resource management, especially in predicting the utilization of Physical Resource Block (PRB)s. |
| Enhancing Cloud-Native Resource Allocation with Probabilistic Forecasting Techniques in O-RAN | Vaishnavi Kasuluru, Luis Blanco, Engin Zeydan, Albert Bel, Angelos Antonopoulos | 2024-07-19 | Download | The need for intelligent and efficient resource provisioning for the productive management of resources in real-world scenarios is growing with the evolution of telecommunications towards the 6G era. |
| On the use of Probabilistic Forecasting for Network Analysis in Open RAN | Vaishnavi Kasuluru, Luis Blanco, Engin Zeydan | 2024-07-19 | Download | Unlike other single-point Artificial Intelligence (AI)-based prediction techniques, such as Long-Short Term Memory (LSTM), probabilistic forecasting techniques (e.g. |
| FaaS Is Not Enough: Serverless Handling of Burst-Parallel Jobs | Daniel Barcelona-Pons, Aitor Arjona, Pedro García-López, Enrique Molina-Giménez, Stepan Klymonchuk | 2024-07-19 | Download | Function-as-a-Service (FaaS) struggles with burst-parallel jobs due to needing multiple independent invocations to start a job. The lack of a group invocation primitive complicates application develop... |
| Theoretical Analysis on Block Time Distributions in Byzantine Fault-Tolerant Consensus Blockchains | Akihiro Fujihara | 2024-07-19 | Download | Some blockchain networks employ a distributed consensus algorithm featuring Byzantine fault tolerance. Notably, certain public chains, such as Cosmos and Tezos, which operate on a proof-of-stake mecha... |
| Affinity-aware Serverless Function Scheduling | Giuseppe De Palma, Saverio Giallorenzo, Jacopo Mauro, Matteo Trentin, Gianluigi Zavattaro | 2024-07-19 | Download | Functions-as-a-Service (FaaS) is a Serverless Cloud paradigm where a platform manages the scheduling (e.g., resource allocation, runtime environments) of stateless functions. |
| On the Complexity of Reachability Properties in Serverless Function Scheduling | Giuseppe De Palma, Saverio Giallorenzo, Jacopo Mauro, Matteo Trentin, Gianluigi Zavattaro | 2024-07-19 | Download | Functions-as-a-Service (FaaS) is a Serverless Cloud paradigm where a platform manages the execution scheduling (e.g., resource allocation, runtime environments) of stateless functions. |
| Where is the Testbed for my Federated Learning Research? | Janez Božič, Amândio R. Faustino, Boris Radovič, Marco Canini, Veljko Pejović | 2024-07-19 | Download | Progressing beyond centralized AI is of paramount importance, yet, distributed AI solutions, in particular various federated learning (FL) algorithms, are often not comprehensively assessed, which pre... |
| TorchGT: A Holistic System for Large-scale Graph Transformer Training | Meng Zhang, Jie Sun, Qinghao Hu, Peng Sun, Zeke Wang, Yonggang Wen, Tianwei Zhang | 2024-07-19 | Download | Graph Transformer is a new architecture that surpasses GNNs in graph learning. While there emerge inspiring algorithm advancements, their practical adoption is still limited, particularly on real-worl... |
| Stochastic Distance in Property Testing | Uri Meir, Gregory Schwartzman, Yuichi Yoshida | 2024-07-19 | Download | We introduce a novel concept termed "stochastic distance" for property testing. Diverging from the traditional definition of distance, where a distance implies that there exist edges that can ... |
| SGDRC: Software-Defined Dynamic Resource Control for Concurrent DNN Inference on NVIDIA GPUs | Yongkang Zhang, Haoxuan Yu, Chenxia Han, Cheng Wang, Baotong Lu, Yunzhe Li, Zhifeng Jiang, Yang Li, Xiaowen Chu, Huaicheng Li | 2024-07-19 | Download | Cloud service providers heavily colocate high-priority, latency-sensitive (LS), and low-priority, best-effort (BE) DNN inference services on the same GPU to improve resource utilization in data center... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Reasoning About Internet Connectivity | Guillermo Baltra, Tarang Saluja, Yuri Pradkin, John Heidemann | 2024-07-19 | Download | Innovation in the Internet requires a global Internet core to enable communication between users in ISPs and services in the cloud. Today, this Internet core is challenged by partial reachability: pol... |
| Routing in Quantum Networks with End-to-End Knowledge | Vinay Kumar, Claudio Cicconetti, Marco Conti, Andrea Passarella | 2024-07-19 | Download | Given the diverse array of physical systems available for quantum computing and the absence of a well-defined quantum internet protocol stack, the design and optimisation of quantum networking protoco... |
| On the Impact of PRB Load Uncertainty Forecasting for Sustainable Open RAN | Vaishnavi Kasuluru, Luis Blanco, Cristian J. Vaca-Rubio, Engin Zeydan | 2024-07-19 | Download | The transition to sustainable Open Radio Access Network (O-RAN) architectures brings new challenges for resource management, especially in predicting the utilization of Physical Resource Block (PRB)s. |
| Enhancing Cloud-Native Resource Allocation with Probabilistic Forecasting Techniques in O-RAN | Vaishnavi Kasuluru, Luis Blanco, Engin Zeydan, Albert Bel, Angelos Antonopoulos | 2024-07-19 | Download | The need for intelligent and efficient resource provisioning for the productive management of resources in real-world scenarios is growing with the evolution of telecommunications towards the 6G era. |
| On the use of Probabilistic Forecasting for Network Analysis in Open RAN | Vaishnavi Kasuluru, Luis Blanco, Engin Zeydan | 2024-07-19 | Download | Unlike other single-point Artificial Intelligence (AI)-based prediction techniques, such as Long-Short Term Memory (LSTM), probabilistic forecasting techniques (e.g. |
| Coverage-aware and Reinforcement Learning Using Multi-agent Approach for HD Map QoS in a Realistic Environment | Jeffrey Redondo, Zhenhui Yuan, Nauman Aslam, Juan Zhang | 2024-07-19 | Download | One effective way to optimize the offloading process is by minimizing the transmission time. This is particularly true in a Vehicular Adhoc Network (VANET) where vehicles frequently download and uploa... |
| Integrated Push-and-Pull Update Model for Goal-Oriented Effective Communication | Pouya Agheli, Nikolaos Pappas, Petar Popovski, Marios Kountouris | 2024-07-19 | Download | This paper studies decision-making for goal-oriented effective communication. We consider an end-to-end status update system where a sensing agent (SA) observes a source, generates and transmits updat... |
| Performance Analysis of Transmission Control Protocol (TCP) Variants | Mansukh Pamarath, Dweep Gogia | 2024-07-19 | Download | There are various TCP variants such as Reno, Tahoe, Vegas, SACK and so on. These variants implement algorithms that handle congestion control. |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Integrating Artificial Intelligence into Operating Systems: A Survey on Techniques, Applications, and Future Directions | Yifan Zhang, Xinkui Zhao, Ziying Li, Guanjie Cheng, Jianwei Yin, Lufei Zhang, Zuoning Chen | 2024-07-19 | Download | Heterogeneous hardware and dynamic workloads worsen long-standing OS bottlenecks in scalability, adaptability, and manageability. At the same time, advances in machine learning (ML), large language mo... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Mixture of Experts with Mixture of Precisions for Tuning Quality of Service | HamidReza Imani, Abdolah Amirany, Tarek El-Ghazawi | 2024-07-19 | Download | The increasing demand for deploying large Mixture-of-Experts (MoE) models in resource-constrained environments necessitates efficient approaches to address their high memory and computational requirem... |
| SGDRC: Software-Defined Dynamic Resource Control for Concurrent DNN Inference on NVIDIA GPUs | Yongkang Zhang, Haoxuan Yu, Chenxia Han, Cheng Wang, Baotong Lu, Yunzhe Li, Zhifeng Jiang, Yang Li, Xiaowen Chu, Huaicheng Li | 2024-07-19 | Download | Cloud service providers heavily colocate high-priority, latency-sensitive (LS), and low-priority, best-effort (BE) DNN inference services on the same GPU to improve resource utilization in data center... |