2024-11-28

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Stoch-IMC: A Bit-Parallel Stochastic In-Memory Computing Architecture Based on STT-MRAM	Amir M. Hajisadeghi, Hamid R. Zarandi, Mahmoud Momtazpour	2024-11-28	Download	In-memory computing (IMC) offloads parts of the computations to memory to fulfill the performance and energy demands of applications such as neuromorphic computing, machine learning, and image process...
PREBA: A Hardware/Software Co-Design for Multi-Instance GPU based AI Inference Servers	Gwangoo Yeo, Jiin Kim, Yujeong Choi, Minsoo Rhu	2024-11-28	Download	NVIDIA's Multi-Instance GPU (MIG) is a feature that enables system designers to reconfigure one large GPU into multiple smaller GPU slices. This work characterizes this emerging GPU and evaluates its ...
DRC-Coder: Automated DRC Checker Code Generation Using LLM Autonomous Agent	Chen-Chia Chang, Chia-Tung Ho, Yaguang Li, Yiran Chen, Haoxing Ren	2024-11-28	Download	In the advanced technology nodes, the integrated design rule checker (DRC) is often utilized in place and route tools for fast optimization loops for power-performance-area.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Marconi: Prefix Caching for the Era of Hybrid LLMs	Rui Pan, Zhuang Wang, Zhen Jia, Can Karakus, Luca Zancato, Tri Dao, Yida Wang, Ravi Netravali	2024-11-28	Download	Hybrid models that combine the language modeling capabilities of Attention layers with the efficiency of Recurrent layers (e.g., State Space Models) have gained traction in practically supporting long...
Strong Linearizability without Compare&Swap: The Case of Bags	Faith Ellen, Gal Sela	2024-11-28	Download	Because strongly-linearizable objects provide stronger guarantees than linearizability, they serve as valuable building blocks for the design of concurrent data structures.
PREBA: A Hardware/Software Co-Design for Multi-Instance GPU based AI Inference Servers	Gwangoo Yeo, Jiin Kim, Yujeong Choi, Minsoo Rhu	2024-11-28	Download	NVIDIA's Multi-Instance GPU (MIG) is a feature that enables system designers to reconfigure one large GPU into multiple smaller GPU slices. This work characterizes this emerging GPU and evaluates its ...
Carbon-Aware Quality Adaptation for Energy-Intensive Services	Philipp Wiesner, Dennis Grinwald, Philipp Weiß, Patrick Wilhelm, Ramin Khalili, Odej Kao	2024-11-28	Download	The energy demand of modern cloud services, particularly those related to generative AI, is increasing at an unprecedented pace. To date, carbon-aware computing strategies have primarily focused on ba...
Many Hands Make Light Work: Accelerating Edge Inference via Multi-Client Collaborative Caching	Wenyi Liang, Jianchun Liu, Hongli Xu, Chunming Qiao, Liusheng Huang	2024-11-28	Download	Edge inference is a technology that enables real-time data processing and analysis on clients near the data source. To ensure compliance with the Service-Level Objectives (SLOs), such as a 30% latency...
Unified schemes for directive-based GPU offloading	Yohei Miki, Toshihiro Hanawa	2024-11-28	Download	GPU is the dominant accelerator device due to its high performance and energy efficiency. Directive-based GPU offloading using OpenACC or OpenMP target is a convenient way to port existing codes origi...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Towards an Implementation of the Knowledge-Based Control Plane for Intelligent Swarm Networks	Xuanchi Guo, Anh Le-Tuan, Danh Le-Phuoc	2024-11-28	Download	This paper proposes the possibility of integrating Dynamic Knowledge Graph (DKG) with Software-Defined Networking (SDN). This new approach aims to assist the management and control capabilities of the...
Machine Learning for Spectrum Sharing: A Survey	Francisco R. V. Guimarães, José Mairton B. da Silva, Charles Casimiro Cavalcante, Gabor Fodor, Mats Bengtsson, Carlo Fischione	2024-11-28	Download	The 5th generation (5G) of wireless systems is being deployed with the aim to provide many sets of wireless communication services, such as low data rates for a massive amount of devices, broadband, l...
ReputeStream: Mitigating Free-Riding through Reputation-Based Multi-Layer P2P Live Streaming	Rashmi Kushwaha, Rahul Bhattacharyya, Yatindra Nath Singh	2024-11-28	Download	This paper presents a novel algorithm for peer-to-peer (P2P) live streaming that addresses the limitations of single-layer systems through a multi-layered approach.
Traffic-cognitive Slicing for Resource-efficient Offloading with Dual-distillation DRL in Multi-edge Systems	Ting Xiaoyang, Minfeng Zhang, Saimin Chen Zhang	2024-11-28	Download	In edge computing, emerging network slicing and computation offloading can support Edge Service Providers (ESPs) better handling diverse distributions of user requests, to improve Quality-of-Service (...
Perspectives on 6G Architectures	Rainer Liebhart, Mansoor Shafi, Harsh Tataria, Gajan Shivanandan, Devaki Chandramouli	2024-11-28	Download	Mobile communications have been undergoing a generational change every ten years. While 5G network deployments are maturing, significant efforts are being made to standardize 6G, which is expected to ...

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
An Integrated Artificial Intelligence Operating System for Advanced Low-Altitude Aviation Applications	Minzhe Tan, Xinlin Fan, Jian He, Yi Hou, Zhan Liu, Yaopeng Jiang, Y. M. Jiang	2024-11-28	Download	This paper introduces a high-performance artificial intelligence operating system tailored for low-altitude aviation, designed to address key challenges such as real-time task execution, computational...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Unified schemes for directive-based GPU offloading	Yohei Miki, Toshihiro Hanawa	2024-11-28	Download	GPU is the dominant accelerator device due to its high performance and energy efficiency. Directive-based GPU offloading using OpenACC or OpenMP target is a convenient way to port existing codes origi...
Automating Energy-Efficient GPU Kernel Generation: A Fast Search-Based Compilation Approach	Yijia Zhang, Zhihong Gou, Shijie Cao, Weigang Feng, Sicheng Zhang, Guohao Dai, Ningyi Xu	2024-11-28	Download	Deep Neural Networks (DNNs) have revolutionized various fields, but their deployment on GPUs often leads to significant energy consumption. Unlike existing methods for reducing GPU energy consumption,...