
2024-11-28

cs.AR - Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Stoch-IMC: A Bit-Parallel Stochastic In-Memory Computing Architecture Based on STT-MRAM | Amir M. Hajisadeghi, Hamid R. Zarandi, Mahmoud Momtazpour | 2024-11-28 | Download | In-memory computing (IMC) offloads parts of the computations to memory to fulfill the performance and energy demands of applications such as neuromorphic computing, machine learning, and image process... |
| PREBA: A Hardware/Software Co-Design for Multi-Instance GPU based AI Inference Servers | Gwangoo Yeo, Jiin Kim, Yujeong Choi, Minsoo Rhu | 2024-11-28 | Download | NVIDIA's Multi-Instance GPU (MIG) is a feature that enables system designers to reconfigure one large GPU into multiple smaller GPU slices. This work characterizes this emerging GPU and evaluates its ... |
| DRC-Coder: Automated DRC Checker Code Generation Using LLM Autonomous Agent | Chen-Chia Chang, Chia-Tung Ho, Yaguang Li, Yiran Chen, Haoxing Ren | 2024-11-28 | Download | In the advanced technology nodes, the integrated design rule checker (DRC) is often utilized in place and route tools for fast optimization loops for power-performance-area. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Marconi: Prefix Caching for the Era of Hybrid LLMs | Rui Pan, Zhuang Wang, Zhen Jia, Can Karakus, Luca Zancato, Tri Dao, Yida Wang, Ravi Netravali | 2024-11-28 | Download | Hybrid models that combine the language modeling capabilities of Attention layers with the efficiency of Recurrent layers (e.g., State Space Models) have gained traction in practically supporting long... |
| Strong Linearizability without Compare&Swap: The Case of Bags | Faith Ellen, Gal Sela | 2024-11-28 | Download | Because strongly-linearizable objects provide stronger guarantees than linearizability, they serve as valuable building blocks for the design of concurrent data structures. |
| PREBA: A Hardware/Software Co-Design for Multi-Instance GPU based AI Inference Servers | Gwangoo Yeo, Jiin Kim, Yujeong Choi, Minsoo Rhu | 2024-11-28 | Download | NVIDIA's Multi-Instance GPU (MIG) is a feature that enables system designers to reconfigure one large GPU into multiple smaller GPU slices. This work characterizes this emerging GPU and evaluates its ... |
| Carbon-Aware Quality Adaptation for Energy-Intensive Services | Philipp Wiesner, Dennis Grinwald, Philipp Weiß, Patrick Wilhelm, Ramin Khalili, Odej Kao | 2024-11-28 | Download | The energy demand of modern cloud services, particularly those related to generative AI, is increasing at an unprecedented pace. To date, carbon-aware computing strategies have primarily focused on ba... |
| Many Hands Make Light Work: Accelerating Edge Inference via Multi-Client Collaborative Caching | Wenyi Liang, Jianchun Liu, Hongli Xu, Chunming Qiao, Liusheng Huang | 2024-11-28 | Download | Edge inference is a technology that enables real-time data processing and analysis on clients near the data source. To ensure compliance with the Service-Level Objectives (SLOs), such as a 30% latency... |
| Unified schemes for directive-based GPU offloading | Yohei Miki, Toshihiro Hanawa | 2024-11-28 | Download | GPU is the dominant accelerator device due to its high performance and energy efficiency. Directive-based GPU offloading using OpenACC or OpenMP target is a convenient way to port existing codes origi... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Towards an Implementation of the Knowledge-Based Control Plane for Intelligent Swarm Networks | Xuanchi Guo, Anh Le-Tuan, Danh Le-Phuoc | 2024-11-28 | Download | This paper proposes the possibility of integrating Dynamic Knowledge Graph (DKG) with Software-Defined Networking (SDN). This new approach aims to assist the management and control capabilities of the... |
| Machine Learning for Spectrum Sharing: A Survey | Francisco R. V. Guimarães, José Mairton B. da Silva, Charles Casimiro Cavalcante, Gabor Fodor, Mats Bengtsson, Carlo Fischione | 2024-11-28 | Download | The 5th generation (5G) of wireless systems is being deployed with the aim to provide many sets of wireless communication services, such as low data rates for a massive amount of devices, broadband, l... |
| ReputeStream: Mitigating Free-Riding through Reputation-Based Multi-Layer P2P Live Streaming | Rashmi Kushwaha, Rahul Bhattacharyya, Yatindra Nath Singh | 2024-11-28 | Download | This paper presents a novel algorithm for peer-to-peer (P2P) live streaming that addresses the limitations of single-layer systems through a multi-layered approach. |
| Traffic-cognitive Slicing for Resource-efficient Offloading with Dual-distillation DRL in Multi-edge Systems | Ting Xiaoyang, Minfeng Zhang, Saimin Chen Zhang | 2024-11-28 | Download | In edge computing, emerging network slicing and computation offloading can support Edge Service Providers (ESPs) better handling diverse distributions of user requests, to improve Quality-of-Service (... |
| Perspectives on 6G Architectures | Rainer Liebhart, Mansoor Shafi, Harsh Tataria, Gajan Shivanandan, Devaki Chandramouli | 2024-11-28 | Download | Mobile communications have been undergoing a generational change every ten years. While 5G network deployments are maturing, significant efforts are being made to standardize 6G, which is expected to ... |

cs.OS - Operating Systems

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| An Integrated Artificial Intelligence Operating System for Advanced Low-Altitude Aviation Applications | Minzhe Tan, Xinlin Fan, Jian He, Yi Hou, Zhan Liu, Yaopeng Jiang, Y. M. Jiang | 2024-11-28 | Download | This paper introduces a high-performance artificial intelligence operating system tailored for low-altitude aviation, designed to address key challenges such as real-time task execution, computational... |

cs.PF - Performance

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Unified schemes for directive-based GPU offloading | Yohei Miki, Toshihiro Hanawa | 2024-11-28 | Download | GPU is the dominant accelerator device due to its high performance and energy efficiency. Directive-based GPU offloading using OpenACC or OpenMP target is a convenient way to port existing codes origi... |
| Automating Energy-Efficient GPU Kernel Generation: A Fast Search-Based Compilation Approach | Yijia Zhang, Zhihong Gou, Shijie Cao, Weigang Feng, Sicheng Zhang, Guohao Dai, Ningyi Xu | 2024-11-28 | Download | Deep Neural Networks (DNNs) have revolutionized various fields, but their deployment on GPUs often leads to significant energy consumption. Unlike existing methods for reducing GPU energy consumption,... |
