2024-07-19

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
LaMAGIC: Language-Model-based Topology Generation for Analog Integrated Circuits	Chen-Chia Chang, Yikang Shen, Shaoze Fan, Jing Li, Shun Zhang, Ningyuan Cao, Yiran Chen, Xin Zhang	2024-07-19	Download	In the realm of electronic and electrical engineering, automation of analog circuit is increasingly vital given the complexity and customized requirements of modern applications.
Performance Modeling and Workload Analysis of Distributed Large Language Model Training and Inference	Joyjit Kundu, Wenzhe Guo, Ali BanaGozar, Udari De Alwis, Sourav Sengupta, Puneet Gupta, Arindam Mallik	2024-07-19	Download	Aligning future system design with the ever-increasing compute needs of large language models (LLMs) is undoubtedly an important problem in today's world.
Mixed-precision Neural Networks on RISC-V Cores: ISA extensions for Multi-Pumped Soft SIMD Operations	Giorgos Armeniakos, Alexis Maras, Sotirios Xydis, Dimitrios Soudris	2024-07-19	Download	Recent advancements in quantization and mixed-precision approaches offer substantial opportunities to improve the speed and energy efficiency of Neural Networks (NN).
LoAS: Fully Temporal-Parallel Dataflow for Dual-Sparse Spiking Neural Networks	Ruokai Yin, Youngeun Kim, Di Wu, Priyadarshini Panda	2024-07-19	Download	Spiking Neural Networks (SNNs) have gained significant research attention in the last decade due to their potential to drive resource-constrained edge devices.
SGDRC: Software-Defined Dynamic Resource Control for Concurrent DNN Inference on NVIDIA GPUs	Yongkang Zhang, Haoxuan Yu, Chenxia Han, Cheng Wang, Baotong Lu, Yunzhe Li, Zhifeng Jiang, Yang Li, Xiaowen Chu, Huaicheng Li	2024-07-19	Download	Cloud service providers heavily colocate high-priority, latency-sensitive (LS), and low-priority, best-effort (BE) DNN inference services on the same GPU to improve resource utilization in data center...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Performance Modeling and Workload Analysis of Distributed Large Language Model Training and Inference	Joyjit Kundu, Wenzhe Guo, Ali BanaGozar, Udari De Alwis, Sourav Sengupta, Puneet Gupta, Arindam Mallik	2024-07-19	Download	Aligning future system design with the ever-increasing compute needs of large language models (LLMs) is undoubtedly an important problem in today's world.
Regression prediction algorithm for energy consumption regression in cloud computing based on horned lizard algorithm optimised convolutional neural network-bidirectional gated recurrent unit	Feiyang Li, Zinan Cao, Qixuan Yu, Xirui Tang	2024-07-19	Download	For this paper, a prediction study of cloud computing energy consumption was conducted by optimising the data regression algorithm based on the horned lizard optimisation algorithm for Convolutional N...
Mixture of Experts with Mixture of Precisions for Tuning Quality of Service	HamidReza Imani, Abdolah Amirany, Tarek El-Ghazawi	2024-07-19	Download	The increasing demand for deploying large Mixture-of-Experts (MoE) models in resource-constrained environments necessitates efficient approaches to address their high memory and computational requirem...
The Vision of Autonomic Computing: Can LLMs Make It a Reality?	Zhiyang Zhang, Fangkai Yang, Xiaoting Qin, Jue Zhang, Qingwei Lin, Gong Cheng, Dongmei Zhang, Saravan Rajmohan, Qi Zhang	2024-07-19	Download	The Vision of Autonomic Computing (ACV), proposed over two decades ago, envisions computing systems that self-manage akin to biological organisms, adapting seamlessly to changing environments.
On the Impact of PRB Load Uncertainty Forecasting for Sustainable Open RAN	Vaishnavi Kasuluru, Luis Blanco, Cristian J. Vaca-Rubio, Engin Zeydan	2024-07-19	Download	The transition to sustainable Open Radio Access Network (O-RAN) architectures brings new challenges for resource management, especially in predicting the utilization of Physical Resource Block (PRB)s.
Enhancing Cloud-Native Resource Allocation with Probabilistic Forecasting Techniques in O-RAN	Vaishnavi Kasuluru, Luis Blanco, Engin Zeydan, Albert Bel, Angelos Antonopoulos	2024-07-19	Download	The need for intelligent and efficient resource provisioning for the productive management of resources in real-world scenarios is growing with the evolution of telecommunications towards the 6G era.
On the use of Probabilistic Forecasting for Network Analysis in Open RAN	Vaishnavi Kasuluru, Luis Blanco, Engin Zeydan	2024-07-19	Download	Unlike other single-point Artificial Intelligence (AI)-based prediction techniques, such as Long-Short Term Memory (LSTM), probabilistic forecasting techniques (e.g.
FaaS Is Not Enough: Serverless Handling of Burst-Parallel Jobs	Daniel Barcelona-Pons, Aitor Arjona, Pedro García-López, Enrique Molina-Giménez, Stepan Klymonchuk	2024-07-19	Download	Function-as-a-Service (FaaS) struggles with burst-parallel jobs due to needing multiple independent invocations to start a job. The lack of a group invocation primitive complicates application develop...
Theoretical Analysis on Block Time Distributions in Byzantine Fault-Tolerant Consensus Blockchains	Akihiro Fujihara	2024-07-19	Download	Some blockchain networks employ a distributed consensus algorithm featuring Byzantine fault tolerance. Notably, certain public chains, such as Cosmos and Tezos, which operate on a proof-of-stake mecha...
Affinity-aware Serverless Function Scheduling	Giuseppe De Palma, Saverio Giallorenzo, Jacopo Mauro, Matteo Trentin, Gianluigi Zavattaro	2024-07-19	Download	Functions-as-a-Service (FaaS) is a Serverless Cloud paradigm where a platform manages the scheduling (e.g., resource allocation, runtime environments) of stateless functions.
On the Complexity of Reachability Properties in Serverless Function Scheduling	Giuseppe De Palma, Saverio Giallorenzo, Jacopo Mauro, Matteo Trentin, Gianluigi Zavattaro	2024-07-19	Download	Functions-as-a-Service (FaaS) is a Serverless Cloud paradigm where a platform manages the execution scheduling (e.g., resource allocation, runtime environments) of stateless functions.
Where is the Testbed for my Federated Learning Research?	Janez Božič, Amândio R. Faustino, Boris Radovič, Marco Canini, Veljko Pejović	2024-07-19	Download	Progressing beyond centralized AI is of paramount importance, yet, distributed AI solutions, in particular various federated learning (FL) algorithms, are often not comprehensively assessed, which pre...
TorchGT: A Holistic System for Large-scale Graph Transformer Training	Meng Zhang, Jie Sun, Qinghao Hu, Peng Sun, Zeke Wang, Yonggang Wen, Tianwei Zhang	2024-07-19	Download	Graph Transformer is a new architecture that surpasses GNNs in graph learning. While there emerge inspiring algorithm advancements, their practical adoption is still limited, particularly on real-worl...
Stochastic Distance in Property Testing	Uri Meir, Gregory Schwartzman, Yuichi Yoshida	2024-07-19	Download	We introduce a novel concept termed "stochastic distance" for property testing. Diverging from the traditional definition of distance, where a distance $t$ implies that there exist $t$ edges that can ...
SGDRC: Software-Defined Dynamic Resource Control for Concurrent DNN Inference on NVIDIA GPUs	Yongkang Zhang, Haoxuan Yu, Chenxia Han, Cheng Wang, Baotong Lu, Yunzhe Li, Zhifeng Jiang, Yang Li, Xiaowen Chu, Huaicheng Li	2024-07-19	Download	Cloud service providers heavily colocate high-priority, latency-sensitive (LS), and low-priority, best-effort (BE) DNN inference services on the same GPU to improve resource utilization in data center...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Reasoning About Internet Connectivity	Guillermo Baltra, Tarang Saluja, Yuri Pradkin, John Heidemann	2024-07-19	Download	Innovation in the Internet requires a global Internet core to enable communication between users in ISPs and services in the cloud. Today, this Internet core is challenged by partial reachability: pol...
Routing in Quantum Networks with End-to-End Knowledge	Vinay Kumar, Claudio Cicconetti, Marco Conti, Andrea Passarella	2024-07-19	Download	Given the diverse array of physical systems available for quantum computing and the absence of a well-defined quantum internet protocol stack, the design and optimisation of quantum networking protoco...
On the Impact of PRB Load Uncertainty Forecasting for Sustainable Open RAN	Vaishnavi Kasuluru, Luis Blanco, Cristian J. Vaca-Rubio, Engin Zeydan	2024-07-19	Download	The transition to sustainable Open Radio Access Network (O-RAN) architectures brings new challenges for resource management, especially in predicting the utilization of Physical Resource Block (PRB)s.
Enhancing Cloud-Native Resource Allocation with Probabilistic Forecasting Techniques in O-RAN	Vaishnavi Kasuluru, Luis Blanco, Engin Zeydan, Albert Bel, Angelos Antonopoulos	2024-07-19	Download	The need for intelligent and efficient resource provisioning for the productive management of resources in real-world scenarios is growing with the evolution of telecommunications towards the 6G era.
On the use of Probabilistic Forecasting for Network Analysis in Open RAN	Vaishnavi Kasuluru, Luis Blanco, Engin Zeydan	2024-07-19	Download	Unlike other single-point Artificial Intelligence (AI)-based prediction techniques, such as Long-Short Term Memory (LSTM), probabilistic forecasting techniques (e.g.
Coverage-aware and Reinforcement Learning Using Multi-agent Approach for HD Map QoS in a Realistic Environment	Jeffrey Redondo, Zhenhui Yuan, Nauman Aslam, Juan Zhang	2024-07-19	Download	One effective way to optimize the offloading process is by minimizing the transmission time. This is particularly true in a Vehicular Adhoc Network (VANET) where vehicles frequently download and uploa...
Integrated Push-and-Pull Update Model for Goal-Oriented Effective Communication	Pouya Agheli, Nikolaos Pappas, Petar Popovski, Marios Kountouris	2024-07-19	Download	This paper studies decision-making for goal-oriented effective communication. We consider an end-to-end status update system where a sensing agent (SA) observes a source, generates and transmits updat...
Performance Analysis of Transmission Control Protocol (TCP) Variants	Mansukh Pamarath, Dweep Gogia	2024-07-19	Download	There are various TCP variants such as Reno, Tahoe, Vegas, SACK and so on. These variants implement algorithms that handle congestion control.

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
Integrating Artificial Intelligence into Operating Systems: A Survey on Techniques, Applications, and Future Directions	Yifan Zhang, Xinkui Zhao, Ziying Li, Guanjie Cheng, Jianwei Yin, Lufei Zhang, Zuoning Chen	2024-07-19	Download	Heterogeneous hardware and dynamic workloads worsen long-standing OS bottlenecks in scalability, adaptability, and manageability. At the same time, advances in machine learning (ML), large language mo...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Mixture of Experts with Mixture of Precisions for Tuning Quality of Service	HamidReza Imani, Abdolah Amirany, Tarek El-Ghazawi	2024-07-19	Download	The increasing demand for deploying large Mixture-of-Experts (MoE) models in resource-constrained environments necessitates efficient approaches to address their high memory and computational requirem...
SGDRC: Software-Defined Dynamic Resource Control for Concurrent DNN Inference on NVIDIA GPUs	Yongkang Zhang, Haoxuan Yu, Chenxia Han, Cheng Wang, Baotong Lu, Yunzhe Li, Zhifeng Jiang, Yang Li, Xiaowen Chu, Huaicheng Li	2024-07-19	Download	Cloud service providers heavily colocate high-priority, latency-sensitive (LS), and low-priority, best-effort (BE) DNN inference services on the same GPU to improve resource utilization in data center...