
2024-07-19

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| LaMAGIC: Language-Model-based Topology Generation for Analog Integrated Circuits | Chen-Chia Chang, Yikang Shen, Shaoze Fan, Jing Li, Shun Zhang, Ningyuan Cao, Yiran Chen, Xin Zhang | 2024-07-19 | Download | In the realm of electronic and electrical engineering, automation of analog circuit is increasingly vital given the complexity and customized requirements of modern applications. |
| Performance Modeling and Workload Analysis of Distributed Large Language Model Training and Inference | Joyjit Kundu, Wenzhe Guo, Ali BanaGozar, Udari De Alwis, Sourav Sengupta, Puneet Gupta, Arindam Mallik | 2024-07-19 | Download | Aligning future system design with the ever-increasing compute needs of large language models (LLMs) is undoubtedly an important problem in today's world. |
| Mixed-precision Neural Networks on RISC-V Cores: ISA extensions for Multi-Pumped Soft SIMD Operations | Giorgos Armeniakos, Alexis Maras, Sotirios Xydis, Dimitrios Soudris | 2024-07-19 | Download | Recent advancements in quantization and mixed-precision approaches offers substantial opportunities to improve the speed and energy efficiency of Neural Networks (NN). |
| LoAS: Fully Temporal-Parallel Dataflow for Dual-Sparse Spiking Neural Networks | Ruokai Yin, Youngeun Kim, Di Wu, Priyadarshini Panda | 2024-07-19 | Download | Spiking Neural Networks (SNNs) have gained significant research attention in the last decade due to their potential to drive resource-constrained edge devices. |
| SGDRC: Software-Defined Dynamic Resource Control for Concurrent DNN Inference on NVIDIA GPUs | Yongkang Zhang, Haoxuan Yu, Chenxia Han, Cheng Wang, Baotong Lu, Yunzhe Li, Zhifeng Jiang, Yang Li, Xiaowen Chu, Huaicheng Li | 2024-07-19 | Download | Cloud service providers heavily colocate high-priority, latency-sensitive (LS), and low-priority, best-effort (BE) DNN inference services on the same GPU to improve resource utilization in data center... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Performance Modeling and Workload Analysis of Distributed Large Language Model Training and Inference | Joyjit Kundu, Wenzhe Guo, Ali BanaGozar, Udari De Alwis, Sourav Sengupta, Puneet Gupta, Arindam Mallik | 2024-07-19 | Download | Aligning future system design with the ever-increasing compute needs of large language models (LLMs) is undoubtedly an important problem in today's world. |
| Regression prediction algorithm for energy consumption regression in cloud computing based on horned lizard algorithm optimised convolutional neural network-bidirectional gated recurrent unit | Feiyang Li, Zinan Cao, Qixuan Yu, Xirui Tang | 2024-07-19 | Download | For this paper, a prediction study of cloud computing energy consumption was conducted by optimising the data regression algorithm based on the horned lizard optimisation algorithm for Convolutional N... |
| Mixture of Experts with Mixture of Precisions for Tuning Quality of Service | HamidReza Imani, Abdolah Amirany, Tarek El-Ghazawi | 2024-07-19 | Download | The increasing demand for deploying large Mixture-of-Experts (MoE) models in resource-constrained environments necessitates efficient approaches to address their high memory and computational requirem... |
| The Vision of Autonomic Computing: Can LLMs Make It a Reality? | Zhiyang Zhang, Fangkai Yang, Xiaoting Qin, Jue Zhang, Qingwei Lin, Gong Cheng, Dongmei Zhang, Saravan Rajmohan, Qi Zhang | 2024-07-19 | Download | The Vision of Autonomic Computing (ACV), proposed over two decades ago, envisions computing systems that self-manage akin to biological organisms, adapting seamlessly to changing environments. |
| On the Impact of PRB Load Uncertainty Forecasting for Sustainable Open RAN | Vaishnavi Kasuluru, Luis Blanco, Cristian J. Vaca-Rubio, Engin Zeydan | 2024-07-19 | Download | The transition to sustainable Open Radio Access Network (O-RAN) architectures brings new challenges for resource management, especially in predicting the utilization of Physical Resource Block (PRB)s. |
| Enhancing Cloud-Native Resource Allocation with Probabilistic Forecasting Techniques in O-RAN | Vaishnavi Kasuluru, Luis Blanco, Engin Zeydan, Albert Bel, Angelos Antonopoulos | 2024-07-19 | Download | The need for intelligent and efficient resource provisioning for the productive management of resources in real-world scenarios is growing with the evolution of telecommunications towards the 6G era. |
| On the use of Probabilistic Forecasting for Network Analysis in Open RAN | Vaishnavi Kasuluru, Luis Blanco, Engin Zeydan | 2024-07-19 | Download | Unlike other single-point Artificial Intelligence (AI)-based prediction techniques, such as Long-Short Term Memory (LSTM), probabilistic forecasting techniques (e.g. |
| FaaS Is Not Enough: Serverless Handling of Burst-Parallel Jobs | Daniel Barcelona-Pons, Aitor Arjona, Pedro García-López, Enrique Molina-Giménez, Stepan Klymonchuk | 2024-07-19 | Download | Function-as-a-Service (FaaS) struggles with burst-parallel jobs due to needing multiple independent invocations to start a job. The lack of a group invocation primitive complicates application develop... |
| Theoretical Analysis on Block Time Distributions in Byzantine Fault-Tolerant Consensus Blockchains | Akihiro Fujihara | 2024-07-19 | Download | Some blockchain networks employ a distributed consensus algorithm featuring Byzantine fault tolerance. Notably, certain public chains, such as Cosmos and Tezos, which operate on a proof-of-stake mecha... |
| Affinity-aware Serverless Function Scheduling | Giuseppe De Palma, Saverio Giallorenzo, Jacopo Mauro, Matteo Trentin, Gianluigi Zavattaro | 2024-07-19 | Download | Functions-as-a-Service (FaaS) is a Serverless Cloud paradigm where a platform manages the scheduling (e.g., resource allocation, runtime environments) of stateless functions. |
| On the Complexity of Reachability Properties in Serverless Function Scheduling | Giuseppe De Palma, Saverio Giallorenzo, Jacopo Mauro, Matteo Trentin, Gianluigi Zavattaro | 2024-07-19 | Download | Functions-as-a-Service (FaaS) is a Serverless Cloud paradigm where a platform manages the execution scheduling (e.g., resource allocation, runtime environments) of stateless functions. |
| Where is the Testbed for my Federated Learning Research? | Janez Božič, Amândio R. Faustino, Boris Radovič, Marco Canini, Veljko Pejović | 2024-07-19 | Download | Progressing beyond centralized AI is of paramount importance, yet, distributed AI solutions, in particular various federated learning (FL) algorithms, are often not comprehensively assessed, which pre... |
| TorchGT: A Holistic System for Large-scale Graph Transformer Training | Meng Zhang, Jie Sun, Qinghao Hu, Peng Sun, Zeke Wang, Yonggang Wen, Tianwei Zhang | 2024-07-19 | Download | Graph Transformer is a new architecture that surpasses GNNs in graph learning. While there emerge inspiring algorithm advancements, their practical adoption is still limited, particularly on real-worl... |
| Stochastic Distance in Property Testing | Uri Meir, Gregory Schwartzman, Yuichi Yoshida | 2024-07-19 | Download | We introduce a novel concept termed "stochastic distance" for property testing. Diverging from the traditional definition of distance, where a distance $t$ implies that there exist $t$ edges that can ... |
| SGDRC: Software-Defined Dynamic Resource Control for Concurrent DNN Inference on NVIDIA GPUs | Yongkang Zhang, Haoxuan Yu, Chenxia Han, Cheng Wang, Baotong Lu, Yunzhe Li, Zhifeng Jiang, Yang Li, Xiaowen Chu, Huaicheng Li | 2024-07-19 | Download | Cloud service providers heavily colocate high-priority, latency-sensitive (LS), and low-priority, best-effort (BE) DNN inference services on the same GPU to improve resource utilization in data center... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Reasoning About Internet Connectivity | Guillermo Baltra, Tarang Saluja, Yuri Pradkin, John Heidemann | 2024-07-19 | Download | Innovation in the Internet requires a global Internet core to enable communication between users in ISPs and services in the cloud. Today, this Internet core is challenged by partial reachability: pol... |
| Routing in Quantum Networks with End-to-End Knowledge | Vinay Kumar, Claudio Cicconetti, Marco Conti, Andrea Passarella | 2024-07-19 | Download | Given the diverse array of physical systems available for quantum computing and the absence of a well-defined quantum internet protocol stack, the design and optimisation of quantum networking protoco... |
| On the Impact of PRB Load Uncertainty Forecasting for Sustainable Open RAN | Vaishnavi Kasuluru, Luis Blanco, Cristian J. Vaca-Rubio, Engin Zeydan | 2024-07-19 | Download | The transition to sustainable Open Radio Access Network (O-RAN) architectures brings new challenges for resource management, especially in predicting the utilization of Physical Resource Block (PRB)s. |
| Enhancing Cloud-Native Resource Allocation with Probabilistic Forecasting Techniques in O-RAN | Vaishnavi Kasuluru, Luis Blanco, Engin Zeydan, Albert Bel, Angelos Antonopoulos | 2024-07-19 | Download | The need for intelligent and efficient resource provisioning for the productive management of resources in real-world scenarios is growing with the evolution of telecommunications towards the 6G era. |
| On the use of Probabilistic Forecasting for Network Analysis in Open RAN | Vaishnavi Kasuluru, Luis Blanco, Engin Zeydan | 2024-07-19 | Download | Unlike other single-point Artificial Intelligence (AI)-based prediction techniques, such as Long-Short Term Memory (LSTM), probabilistic forecasting techniques (e.g. |
| Coverage-aware and Reinforcement Learning Using Multi-agent Approach for HD Map QoS in a Realistic Environment | Jeffrey Redondo, Zhenhui Yuan, Nauman Aslam, Juan Zhang | 2024-07-19 | Download | One effective way to optimize the offloading process is by minimizing the transmission time. This is particularly true in a Vehicular Adhoc Network (VANET) where vehicles frequently download and uploa... |
| Integrated Push-and-Pull Update Model for Goal-Oriented Effective Communication | Pouya Agheli, Nikolaos Pappas, Petar Popovski, Marios Kountouris | 2024-07-19 | Download | This paper studies decision-making for goal-oriented effective communication. We consider an end-to-end status update system where a sensing agent (SA) observes a source, generates and transmits updat... |
| Performance Analysis of Transmission Control Protocol (TCP) Variants | Mansukh Pamarath, Dweep Gogia | 2024-07-19 | Download | There are various TCP variants such as Reno, Tahoe, Vegas, SACK and so on. These variants implement algorithms that handle congestion control. |

cs.OS - Operating Systems

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Integrating Artificial Intelligence into Operating Systems: A Survey on Techniques, Applications, and Future Directions | Yifan Zhang, Xinkui Zhao, Ziying Li, Guanjie Cheng, Jianwei Yin, Lufei Zhang, Zuoning Chen | 2024-07-19 | Download | Heterogeneous hardware and dynamic workloads worsen long-standing OS bottlenecks in scalability, adaptability, and manageability. At the same time, advances in machine learning (ML), large language mo... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Mixture of Experts with Mixture of Precisions for Tuning Quality of Service | HamidReza Imani, Abdolah Amirany, Tarek El-Ghazawi | 2024-07-19 | Download | The increasing demand for deploying large Mixture-of-Experts (MoE) models in resource-constrained environments necessitates efficient approaches to address their high memory and computational requirem... |
| SGDRC: Software-Defined Dynamic Resource Control for Concurrent DNN Inference on NVIDIA GPUs | Yongkang Zhang, Haoxuan Yu, Chenxia Han, Cheng Wang, Baotong Lu, Yunzhe Li, Zhifeng Jiang, Yang Li, Xiaowen Chu, Huaicheng Li | 2024-07-19 | Download | Cloud service providers heavily colocate high-priority, latency-sensitive (LS), and low-priority, best-effort (BE) DNN inference services on the same GPU to improve resource utilization in data center... |
