2024-07-17
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| SmartQuant: CXL-based AI Model Store in Support of Runtime Configurable Weight Quantization | Rui Xie, Asad Ul Haq, Linsen Ma, Krystal Sun, Sanchari Sen, Swagath Venkataramani, Liu Liu, Tong Zhang | 2024-07-17 | Download | Recent studies have revealed that, during the inference on generative AI models such as transformer, the importance of different weights exhibits substantial context-dependent variations. |
| CHOSEN: Compilation to Hardware Optimization Stack for Efficient Vision Transformer Inference | Mohammad Erfan Sadeghi, Arash Fayyazi, Suhas Somashekar, Armin Abdollahi, Massoud Pedram | 2024-07-17 | Download | Vision Transformers (ViTs) represent a groundbreaking shift in machine learning approaches to computer vision. Unlike traditional approaches, ViTs employ the self-attention mechanism, which has been w... |
| Highly Efficient Parallel Row-Layered Min-Sum MDPC Decoder for McEliece Cryptosystem | Jiaxuan Cai, Xinmiao Zhang | 2024-07-17 | Download | The medium-density parity-check (MDPC) code-based McEliece cryptosystem remains a finalist of the post-quantum cryptography standard. The Min-sum decoding algorithm achieves better performance-complex... |
| ARTEMIS: A Mixed Analog-Stochastic In-DRAM Accelerator for Transformer Neural Networks | Salma Afifi, Ishan Thakkar, Sudeep Pasricha | 2024-07-17 | Download | Transformers have emerged as a powerful tool for natural language processing (NLP) and computer vision. Through the attention mechanism, these models have exhibited remarkable performance gains when c... |
| MCU-MixQ: A HW/SW Co-optimized Mixed-precision Neural Network Design Framework for MCUs | Junfeng Gong, Cheng Liu, Long Cheng, Huawei Li, Xiaowei Li | 2024-07-17 | Download | Mixed-precision neural network (MPNN) that utilizes just enough data width for the neural network processing is an effective approach to meet the stringent resources constraints including memory and c... |
| IICPilot: An Intelligent Integrated Circuit Backend Design Framework Using Open EDA | Zesong Jiang, Qing Zhang, Cheng Liu, Long Cheng, Huawei Li, Xiaowei Li | 2024-07-17 | Download | Open-source EDA tools are rapidly advancing, fostering collaboration, innovation, and knowledge sharing within the EDA community. However, the growing complexity of these tools, characterized by numer... |
| Graphitron: A Domain Specific Language for FPGA-based Graph Processing Accelerator Generation | Xinmiao Zhang, Zheng Feng, Shengwen Liang, Xinyu Chen, Cheng Liu, Huawei Li, Xiaowei Li | 2024-07-17 | Download | FPGA-based graph processing accelerators, enabling extensive customization, have demonstrated significant energy efficiency over general computing engines like CPUs and GPUs. |
| SigDLA: A Deep Learning Accelerator Extension for Signal Processing | Fangfa Fu, Wenyu Zhang, Zesong Jiang, Zhiyu Zhu, Guoyu Li, Bing Yang, Cheng Liu, Liyi Xiao, Jinxiang Wang, Huawei Li, Xiaowei Li | 2024-07-17 | Download | Deep learning and signal processing are closely correlated in many IoT scenarios such as anomaly detection to empower intelligence of things. Many IoT processors utilize digital signal processors (DSP... |
| An Efficient Algorithm for Modulus Operation and Its Hardware Implementation in Prime Number Calculation | W. A. Susantha Wijesinghe | 2024-07-17 | Download | This paper presents a novel algorithm for the modulus operation for FPGA implementation. The proposed algorithm uses only addition, subtraction, logical, and bit shift operations, avoiding the complexi... |
| StoX-Net: Stochastic Processing of Partial Sums for Efficient In-Memory Computing DNN Accelerators | Ethan G Rogers, Sohan Salahuddin Mugdho, Kshemal Kshemendra Gupte, Cheng Wang | 2024-07-17 | Download | Crossbar-based in-memory computing (IMC) has emerged as a promising platform for hardware acceleration of deep neural networks (DNNs). However, the energy and latency of IMC systems are dominated by t... |
| SENTAUR: Security EnhaNced Trojan Assessment Using LLMs Against Undesirable Revisions | Jitendra Bhandari, Rajat Sadhukhan, Prashanth Krishnamurthy, Farshad Khorrami, Ramesh Karri | 2024-07-17 | Download | A globally distributed IC supply chain brings risks due to untrusted third parties. The risks span inadvertent use of hardware Trojan (HT), inserted Intellectual Property (3P-IP) or Electronic Design ... |
| Chip Placement with Diffusion Models | Vint Lee, Minh Nguyen, Leena Elzeiny, Chun Deng, Pieter Abbeel, John Wawrzynek | 2024-07-17 | Download | Macro placement is a vital step in digital circuit design that defines the physical location of large collections of components, known as macros, on a 2D chip. |
| RTL Verification for Secure Speculation Using Contract Shadow Logic | Qinhan Tan, Yuheng Yang, Thomas Bourgeat, Sharad Malik, Mengjia Yan | 2024-07-17 | Download | Modern out-of-order processors face speculative execution attacks. Despite various proposed software and hardware mitigations to prevent such attacks, new attacks keep arising from unknown vulnerabili... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Proof-of-Collaborative-Learning: A Multi-winner Federated Learning Consensus Algorithm | Amirreza Sokhankhosh, Sara Rouhani | 2024-07-17 | Download | Regardless of their variations, blockchains require a consensus mechanism to validate transactions, supervise added blocks, maintain network security, synchronize the network state, and distribute inc... |
| Automated Gateways: A Smart Contract-Powered Solution for Interoperability Across Blockchains | Koosha Esmaeilzadeh Khorasani, Sara Rouhani, Rui Pan, Vahid Pourheidari | 2024-07-17 | Download | Interoperability is a significant challenge in blockchain technology, hindering seamless data and service sharing across diverse blockchain networks. |
| A Framework for testing Federated Learning algorithms using an edge-like environment | Felipe Machado Schwanck, Marcos Tomazzoli Leipnitz, Joel Luís Carbonera, Juliano Araujo Wickboldt | 2024-07-17 | Download | Federated Learning (FL) is a machine learning paradigm in which many clients cooperatively train a single centralized model while keeping their data private and decentralized. |
| FlexFL: Heterogeneous Federated Learning via APoZ-Guided Flexible Pruning in Uncertain Scenarios | Zekai Chen, Chentao Jia, Ming Hu, Xiaofei Xie, Anran Li, Mingsong Chen | 2024-07-17 | Download | Along with the increasing popularity of Deep Learning (DL) techniques, more and more Artificial Intelligence of Things (AIoT) systems are adopting federated learning (FL) to enable privacy-aware colla... |
| LSKV: A Confidential Distributed Datastore to Protect Critical Data in the Cloud | Andrew Jeffery, Julien Maffre, Heidi Howard, Richard Mortier | 2024-07-17 | Download | Software services are increasingly migrating to the cloud, requiring trust in actors with direct access to the hardware, software and data comprising the service. |
| Continuous reasoning for adaptive container image distribution in the cloud-edge continuum | Damiano Azzolini, Stefano Forti, Antonio Ielo | 2024-07-17 | Download | Cloud-edge computing requires applications to operate across diverse infrastructures, often triggered by cyber-physical events. Containers offer a lightweight deployment option but pulling images from... |
| Computing: Looking Back and Moving Forward | Muhammed Golec, Sukhpal Singh Gill | 2024-07-17 | Download | The Internet and computer commercialization have transformed the computing systems area over the past sixty years, affecting society. Computer systems have evolved to meet diverse social needs thanks ... |
| LLM Inference Serving: Survey of Recent Advances and Opportunities | Baolin Li, Yankai Jiang, Vijay Gadepally, Devesh Tiwari | 2024-07-17 | Download | This survey offers a comprehensive overview of recent advancements in Large Language Model (LLM) serving systems, focusing on research since the year 2023. |
| Mitigating Interference of Microservices with a Scoring Mechanism in Large-scale Clusters | Dingyu Yang, Kangpeng Zheng, Shiyou Qian, Jian Cao, Guangtao Xue | 2024-07-17 | Download | Co-locating latency-critical services (LCSs) and best-effort jobs (BEJs) constitutes the principal approach for enhancing resource utilization in production. |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| A Framework for testing Federated Learning algorithms using an edge-like environment | Felipe Machado Schwanck, Marcos Tomazzoli Leipnitz, Joel Luís Carbonera, Juliano Araujo Wickboldt | 2024-07-17 | Download | Federated Learning (FL) is a machine learning paradigm in which many clients cooperatively train a single centralized model while keeping their data private and decentralized. |
| A Scheduler for Real-Time Service in Wi-Fi 8 Multi-AP Networks With Parameterized Spatial Reuse | Kirill Chemrov, Dmitry Bankov, Andrey Lyakhov, Evgeny Khorov | 2024-07-17 | Download | Real-time applications (RTAs) require low delays and impose a significant challenge to Wi-Fi. In Wi-Fi, high delays are often caused by waiting for the channel to become idle. |
| Plausibly Deniable Content Discovery for Bitswap Using Random Walks | Manuel Wedler, Erik Daniel, Florian Tschorsch | 2024-07-17 | Download | Bitswap is the data exchange protocol for the content-addressed peer-to-peer overlay network IPFS. During content discovery, Bitswap reveals the interest of a peer in content to all neighbors, enablin... |
| Bayesian Optimization for Fast Radio Mapping and Localization with an Autonomous Aerial Drone | Paul S. Kudyba, Qin Lu, Haijian Sun | 2024-07-17 | Download | This paper explores how a flying drone can autonomously navigate while constructing a narrowband radio map for signal localization. As flying drones become more ubiquitous, their wireless signals will... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Cheddar: A Swift Fully Homomorphic Encryption Library Designed for GPU Architectures | Wonseok Choi, Jongmin Kim, Jung Ho Ahn | 2024-07-17 | Download | Fully homomorphic encryption (FHE) frees cloud computing from privacy concerns by enabling secure computation on encrypted data. However, its substantial computational and memory overhead results in s... |