2024-11-26

cs.AR - Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| SoftmAP: Software-Hardware Co-design for Integer-Only Softmax on Associative Processors | Mariam Rakka, Jinhao Li, Guohao Dai, Ahmed Eltawil, Mohammed E. Fouda, Fadi Kurdahi | 2024-11-26 | Download | Recent research efforts focus on reducing the computational and memory overheads of Large Language Models (LLMs) to make them feasible on resource-constrained devices. |
| RTL-Breaker: Assessing the Security of LLMs against Backdoor Attacks on HDL Code Generation | Lakshmi Likhitha Mankali, Jitendra Bhandari, Manaar Alam, Ramesh Karri, Michail Maniatakos, Ozgur Sinanoglu, Johann Knechtel | 2024-11-26 | Download | Large language models (LLMs) have demonstrated remarkable potential with code generation/completion tasks for hardware design. In fact, LLM-based hardware description language (HDL) code generation ha... |
| Rapid Deployment of Domain-specific Hyperspectral Image Processors with Application to Autonomous Driving | Jon Gutiérrez-Zaballa, Koldo Basterretxea, Javier Echanobe, Óscar Mata-Carballeira, M. Victoria Martínez | 2024-11-26 | Download | The article discusses the use of low cost System-On-Module (SOM) platforms for the implementation of efficient hyperspectral imaging (HSI) processors for application in autonomous driving. |
| High-Performance and Scalable Fault-Tolerant Quantum Computation with Lattice Surgery on a 2.5D Architecture | Yosuke Ueno, Taku Saito, Teruo Tanimoto, Yasunari Suzuki, Yutaka Tabuchi, Shuhei Tamate, Hiroshi Nakamura | 2024-11-26 | Download | Due to the high error rate of a qubit, detecting and correcting errors on it is essential for fault-tolerant quantum computing (FTQC). Among several FTQC techniques, lattice surgery (LS) using surface... |
| Efficient transformer adaptation for analog in-memory computing via low-rank adapters | Chen Li, Elena Ferro, Corey Lammie, Manuel Le Gallo, Irem Boybat, Bipin Rajendran | 2024-11-26 | Download | Analog In-Memory Computing (AIMC) offers a promising solution to the von Neumann bottleneck. However, deploying transformer models on AIMC remains challenging due to their inherent need for flexibilit... |
| PIM-AI: A Novel Architecture for High-Efficiency LLM Inference | Cristobal Ortega, Yann Falevoz, Renaud Ayrignac | 2024-11-26 | Download | Large Language Models (LLMs) have become essential in a variety of applications due to their advanced language understanding and generation capabilities. |
| A High Energy-Efficiency Multi-core Neuromorphic Architecture for Deep SNN Training | Mingjing Li, Huihui Zhou, Xiaofeng Xu, Zhiwei Zhong, Puli Quan, Xueke Zhu, Yanyu Lin, Wenjie Lin, Hongyu Guo, Junchao Zhang, Yunhao Ma, Wei Wang, Qingyan Meng, Zhengyu Ma, Guoqi Li, Xiaoxin Cui, Yonghong Tian | 2024-11-26 | Download | There is a growing necessity for edge training to adapt to dynamically changing environment. Neuromorphic computing represents a significant pathway for high-efficiency intelligent computation in ener... |
| Single Event Upsets characterization of 65 nm CMOS 6T and 8T SRAM cells for ground level environment | Daniel Malagon, Gabriel Torrens, Jaume Segura, Sebastia A. Bota | 2024-11-26 | Download | We present experimental results of the cross-section related to cosmic-ray irradiation at ground level for minimum-sized six-transistors (6T) and eight-transistors (8T) bit-cells SRAM memories impleme... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| A Parallel Scan Algorithm in the Tensor Core Unit Model | Anastasios Zouzias, William F. McColl | 2024-11-26 | Download | We present a parallel scan (prefix sum) algorithm in the Tensor Core Unit (TCU) model of computation. The TCU model assumes that multiplication between two square matrices of constant size s is a ba... |
| RankMap: Priority-Aware Multi-DNN Manager for Heterogeneous Embedded Devices | Andreas Karatzas, Dimitrios Stamoulis, Iraklis Anagnostopoulos | 2024-11-26 | Download | Modern edge data centers simultaneously handle multiple Deep Neural Networks (DNNs), leading to significant challenges in workload management. |
| Adaptive Client Selection with Personalization for Communication Efficient Federated Learning | Allan M. de Souza, Filipe Maciel, Joahannes B. D. da Costa, Luiz F. Bittencourt, Eduardo Cerqueira, Antonio A. F. Loureiro, Leandro A. Villas | 2024-11-26 | Download | Federated Learning (FL) is a distributed approach to collaboratively training machine learning models. FL requires a high level of communication between the devices and a central server, thus imposing... |
| Rapid Distributed Fine-tuning of a Segmentation Model Onboard Satellites | Meghan Plumridge, Rasmus Maråk, Chiara Ceccobello, Pablo Gómez, Gabriele Meoni, Filip Svoboda, Nicholas D. Lane | 2024-11-26 | Download | Segmentation of Earth observation (EO) satellite data is critical for natural hazard analysis and disaster response. However, processing EO data at ground stations introduces delays due to data transm... |
| A Cloud-based Real-time Probabilistic Remaining Useful Life (RUL) Estimation using the Sequential Monte Carlo (SMC) Method | Karthik Reddy Lyathakula, Fuh-Gwo Yuan | 2024-11-26 | Download | The remaining useful life (RUL) estimation is an important metric that helps in condition-based maintenance. Damage data obtained from the diagnostics techniques are often noisy and the RUL estimated ... |
| APEX: An Extensible and Dynamism-Aware Simulator for Automated Parallel Execution in LLM Serving | Yi-Chien Lin, Woosuk Kwon, Ronald Pineda, Fanny Nina Paravecino | 2024-11-26 | Download | Efficiently serving Large Language Models (LLMs) requires selecting an optimal parallel execution plan, balancing computation, memory, and communication overhead. |
| Evaluating the Overhead of the Performance Profiler Cloudprofiler With MooBench | Shinhyung Yang, David Georg Reichelt, Wilhelm Hasselbring | 2024-11-26 | Download | Performance engineering has become crucial for the cloud-native architecture. This architecture deploys multiple services, with each service representing an orchestration of containerized processes. |
| Joint Resource Optimization, Computation Offloading and Resource Slicing for Multi-Edge Traffic-Cognitive Networks | Ting Xiaoyang, Minfeng Zhang, Shu gonglee, Saimin Chen Zhang | 2024-11-26 | Download | The evolving landscape of edge computing envisions platforms operating as dynamic intermediaries between application providers and edge servers (ESs), where task offloading is coupled with payments fo... |
| PIM-AI: A Novel Architecture for High-Efficiency LLM Inference | Cristobal Ortega, Yann Falevoz, Renaud Ayrignac | 2024-11-26 | Download | Large Language Models (LLMs) have become essential in a variety of applications due to their advanced language understanding and generation capabilities. |
| A High Energy-Efficiency Multi-core Neuromorphic Architecture for Deep SNN Training | Mingjing Li, Huihui Zhou, Xiaofeng Xu, Zhiwei Zhong, Puli Quan, Xueke Zhu, Yanyu Lin, Wenjie Lin, Hongyu Guo, Junchao Zhang, Yunhao Ma, Wei Wang, Qingyan Meng, Zhengyu Ma, Guoqi Li, Xiaoxin Cui, Yonghong Tian | 2024-11-26 | Download | There is a growing necessity for edge training to adapt to dynamically changing environment. Neuromorphic computing represents a significant pathway for high-efficiency intelligent computation in ener... |
| Distributed Load Balancing with Workload-Dependent Service Rates | Wenxin Zhang, Santiago R. Balseiro, Robert Kleinberg, Vahab Mirrokni, Balasubramanian Sivan, Bartek Wydrowski | 2024-11-26 | Download | We study distributed load balancing in bipartite queueing systems where frontends route jobs to heterogeneous backends with workload-dependent service rates. |
| KVPR: Efficient LLM Inference with I/O-Aware KV Cache Partial Recomputation | Chaoyi Jiang, Lei Gao, Hossein Entezari Zarch, Murali Annavaram | 2024-11-26 | Download | Inference for Large Language Models (LLMs) is computationally demanding. To reduce the cost of auto-regressive decoding, Key-Value (KV) cache is used to store intermediate activations, which significa... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Combining Threat Intelligence with IoT Scanning to Predict Cyber Attack | Jubin Abhishek Soni, Amit Anand, Rajesh Kumar Pandey, Aniket Abhishek Soni | 2024-11-26 | Download | While the Web has become a global platform for communication, malicious actors, including hackers and hacktivist groups, often disseminate ideological content and coordinate activities through the "Da... |
| AI-Augmented Ethical Hacking: A Practical Examination of Manual Exploitation and Privilege Escalation in Linux Environments | Haitham S. Al-Sinani, Chris J. Mitchell | 2024-11-26 | Download | This study explores the application of generative AI (GenAI) within manual exploitation and privilege escalation tasks in Linux-based penetration testing environments, two areas critical to comprehens... |
| A Primer on AP Power Save in Wi-Fi 8: Overview, Analysis, and Open Challenges | Roger Sanchez-Vital, Andrey Belogaev, Carles Gomez, Jeroen Famaey, Eduard Garcia-Villegas | 2024-11-26 | Download | Wi-Fi facilitates the Internet connectivity of billions of devices worldwide, making it an indispensable technology for modern life. Wi-Fi networks are becoming significantly denser, making energy con... |
| MetaGraphLoc: A Graph-based Meta-learning Scheme for Indoor Localization via Sensor Fusion | Yaya Etiabi, Eslam Eldeeb, Mohammad Shehab, Wafa Njima, Hirley Alves, Mohamed-Slim Alouini, El Mehdi Amhoud | 2024-11-26 | Download | Accurate indoor localization remains challenging due to variations in wireless signal environments and limited data availability. This paper introduces MetaGraphLoc, a novel system leveraging sensor f... |

cs.PF - Performance

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Tracing Optimization for Performance Modeling and Regression Detection | Kaveh Shahedi, Heng Li, Maxime Lamothe, Foutse Khomh | 2024-11-26 | Download | Software performance modeling plays a crucial role in developing and maintaining software systems. A performance model analytically describes the relationship between the performance of a system and i... |
| KVPR: Efficient LLM Inference with I/O-Aware KV Cache Partial Recomputation | Chaoyi Jiang, Lei Gao, Hossein Entezari Zarch, Murali Annavaram | 2024-11-26 | Download | Inference for Large Language Models (LLMs) is computationally demanding. To reduce the cost of auto-regressive decoding, Key-Value (KV) cache is used to store intermediate activations, which significa... |
