2024-03-24

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Thermal Analysis for NVIDIA GTX480 Fermi GPU Architecture	Savinay Nagendra	2024-03-24	Download	In this project, we design a four-layer (Silicon|TIM|Silicon|TIM), 3D floor plan for NVIDIA GTX480 Fermi GPU architecture and compare heat dissipation and power trends for matrix multiplication and...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Q-adaptive: A Multi-Agent Reinforcement Learning Based Routing on Dragonfly Network	Yao Kang, Xin Wang, Zhiling Lan	2024-03-24	Download	High-radix interconnects such as Dragonfly and its variants rely on adaptive routing to balance network traffic for optimum performance. Ideally, adaptive routing attempts to forward packets between m...
MRSch: Multi-Resource Scheduling for HPC	Boyang Li, Yuping Fan, Matthew Dearing, Zhiling Lan, Paul Rich, William Allcock, Michael Papka	2024-03-24	Download	Emerging workloads in high-performance computing (HPC) are embracing significant changes, such as having diverse resource requirements instead of being CPU-centric.
Interpretable Modeling of Deep Reinforcement Learning Driven Scheduling	Boyang Li, Zhiling Lan, Michael E. Papka	2024-03-24	Download	In the field of high-performance computing (HPC), there has been recent exploration into the use of deep reinforcement learning for cluster scheduling (DRL scheduling), which has demonstrated promisin...
Study of Workload Interference with Intelligent Routing on Dragonfly	Yao Kang, Xin Wang, Zhiling Lan	2024-03-24	Download	Dragonfly interconnect is a crucial network technology for supercomputers. To support exascale systems, network resources are shared such that links and routers are not dedicated to any node pair.
Arena: Efficiently Training Large Models via Dynamic Scheduling and Adaptive Parallelism Co-Design	Chunyu Xue, Weihao Cui, Quan Chen, Chen Chen, Han Zhao, Shulai Zhang, Linmei Wang, Yan Li, Limin Xiao, Weifeng Zhang, Jing Yang, Bingsheng He, Minyi Guo	2024-03-24	Download	Efficiently training large-scale models (LMs) in GPU clusters involves two separate avenues: inter-job dynamic scheduling and intra-job adaptive parallelism (AP).

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Q-adaptive: A Multi-Agent Reinforcement Learning Based Routing on Dragonfly Network	Yao Kang, Xin Wang, Zhiling Lan	2024-03-24	Download	High-radix interconnects such as Dragonfly and its variants rely on adaptive routing to balance network traffic for optimum performance. Ideally, adaptive routing attempts to forward packets between m...
Study of Workload Interference with Intelligent Routing on Dragonfly	Yao Kang, Xin Wang, Zhiling Lan	2024-03-24	Download	Dragonfly interconnect is a crucial network technology for supercomputers. To support exascale systems, network resources are shared such that links and routers are not dedicated to any node pair.
Evaluation of Greedy and CBF for ETSI non-area GeoNetworking: The impact of DCC	Oscar Amador, Maria Calderon, Manuel Urueña, Ignacio Soto	2024-03-24	Download	This paper evaluates the performance of the two ETSI non-area forwarding algorithms in the GeoNetworking specification: Greedy Forwarding and Non-Area Contention-Based Forwarding (CBF).
Interference Management for Integrated Sensing and Communication Systems: A Survey	Yangyang Niu, Zhiqing Wei, Lin Wang, Huici Wu, Zhiyong Feng	2024-03-24	Download	Emerging applications such as autonomous driving and Internet of things (IoT) services put forward the demand for simultaneous sensing and communication functions in the same system.
Digital Twin Assisted Intelligent Network Management for Vehicular Applications	Kaige Qu, Weihua Zhuang	2024-03-24	Download	The emerging data-driven methods based on artificial intelligence (AI) have paved the way for intelligent, flexible, and adaptive network management in vehicular applications.
Prioritized Multi-Tenant Traffic Engineering for Dynamic QoS Provisioning in Autonomous SDN-OpenFlow Edge Networks	Mohammad Sajid Shahriar, Faisal Ahmed, Genshe Chen, Khanh D. Pham, Suresh Subramaniam, Motoharu Matsuura, Hiroshi Hasegawa, Shih-Chun Lin	2024-03-24	Download	This letter indicates the critical need for prioritized multi-tenant quality-of-service (QoS) management by emerging mobile edge systems, particularly for high-throughput beyond fifth-generation netwo...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Explainable Port Mapping Inference with Sparse Performance Counters for AMD's Zen Architectures	Fabian Ritter, Sebastian Hack	2024-03-24	Download	Performance models are instrumental for optimizing performance-sensitive code. When modeling the use of functional units of out-of-order x86-64 CPUs, data availability varies by the manufacturer: Inst...
Performance evaluation of accelerated complex multiple-precision LU decomposition	Tomonori Kouya	2024-03-24	Download	The direct method is one of the most important algorithms for solving linear systems of equations, with LU decomposition comprising a significant portion of its computation time.
MicroHD: An Accuracy-Driven Optimization of Hyperdimensional Computing Algorithms for TinyML systems	Flavio Ponzina, Tajana Rosing	2024-03-24	Download	Hyperdimensional computing (HDC) is emerging as a promising AI approach that can effectively target TinyML applications thanks to its lightweight computing and memory requirements.