2024-11-26

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
SoftmAP: Software-Hardware Co-design for Integer-Only Softmax on Associative Processors	Mariam Rakka, Jinhao Li, Guohao Dai, Ahmed Eltawil, Mohammed E. Fouda, Fadi Kurdahi	2024-11-26	Download	Recent research efforts focus on reducing the computational and memory overheads of Large Language Models (LLMs) to make them feasible on resource-constrained devices.
RTL-Breaker: Assessing the Security of LLMs against Backdoor Attacks on HDL Code Generation	Lakshmi Likhitha Mankali, Jitendra Bhandari, Manaar Alam, Ramesh Karri, Michail Maniatakos, Ozgur Sinanoglu, Johann Knechtel	2024-11-26	Download	Large language models (LLMs) have demonstrated remarkable potential with code generation/completion tasks for hardware design. In fact, LLM-based hardware description language (HDL) code generation ha...
Rapid Deployment of Domain-specific Hyperspectral Image Processors with Application to Autonomous Driving	Jon Gutiérrez-Zaballa, Koldo Basterretxea, Javier Echanobe, Óscar Mata-Carballeira, M. Victoria Martínez	2024-11-26	Download	The article discusses the use of low cost System-On-Module (SOM) platforms for the implementation of efficient hyperspectral imaging (HSI) processors for application in autonomous driving.
High-Performance and Scalable Fault-Tolerant Quantum Computation with Lattice Surgery on a 2.5D Architecture	Yosuke Ueno, Taku Saito, Teruo Tanimoto, Yasunari Suzuki, Yutaka Tabuchi, Shuhei Tamate, Hiroshi Nakamura	2024-11-26	Download	Due to the high error rate of a qubit, detecting and correcting errors on it is essential for fault-tolerant quantum computing (FTQC). Among several FTQC techniques, lattice surgery (LS) using surface...
Efficient transformer adaptation for analog in-memory computing via low-rank adapters	Chen Li, Elena Ferro, Corey Lammie, Manuel Le Gallo, Irem Boybat, Bipin Rajendran	2024-11-26	Download	Analog In-Memory Computing (AIMC) offers a promising solution to the von Neumann bottleneck. However, deploying transformer models on AIMC remains challenging due to their inherent need for flexibilit...
PIM-AI: A Novel Architecture for High-Efficiency LLM Inference	Cristobal Ortega, Yann Falevoz, Renaud Ayrignac	2024-11-26	Download	Large Language Models (LLMs) have become essential in a variety of applications due to their advanced language understanding and generation capabilities.
A High Energy-Efficiency Multi-core Neuromorphic Architecture for Deep SNN Training	Mingjing Li, Huihui Zhou, Xiaofeng Xu, Zhiwei Zhong, Puli Quan, Xueke Zhu, Yanyu Lin, Wenjie Lin, Hongyu Guo, Junchao Zhang, Yunhao Ma, Wei Wang, Qingyan Meng, Zhengyu Ma, Guoqi Li, Xiaoxin Cui, Yonghong Tian	2024-11-26	Download	There is a growing necessity for edge training to adapt to dynamically changing environment. Neuromorphic computing represents a significant pathway for high-efficiency intelligent computation in ener...
Single Event Upsets characterization of 65 nm CMOS 6T and 8T SRAM cells for ground level environment	Daniel Malagon, Gabriel Torrens, Jaume Segura, Sebastia A. Bota	2024-11-26	Download	We present experimental results of the cross-section related to cosmic-ray irradiation at ground level for minimum-sized six-transistors (6T) and eight-transistors (8T) bit-cells SRAM memories impleme...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
A Parallel Scan Algorithm in the Tensor Core Unit Model	Anastasios Zouzias, William F. McColl	2024-11-26	Download	We present a parallel scan (prefix sum) algorithm in the Tensor Core Unit (TCU) model of computation. The TCU model assumes that multiplication between two square matrices of constant size $s$ is a ba...
RankMap: Priority-Aware Multi-DNN Manager for Heterogeneous Embedded Devices	Andreas Karatzas, Dimitrios Stamoulis, Iraklis Anagnostopoulos	2024-11-26	Download	Modern edge data centers simultaneously handle multiple Deep Neural Networks (DNNs), leading to significant challenges in workload management.
Adaptive Client Selection with Personalization for Communication Efficient Federated Learning	Allan M. de Souza, Filipe Maciel, Joahannes B. D. da Costa, Luiz F. Bittencourt, Eduardo Cerqueira, Antonio A. F. Loureiro, Leandro A. Villas	2024-11-26	Download	Federated Learning (FL) is a distributed approach to collaboratively training machine learning models. FL requires a high level of communication between the devices and a central server, thus imposing...
Rapid Distributed Fine-tuning of a Segmentation Model Onboard Satellites	Meghan Plumridge, Rasmus Maråk, Chiara Ceccobello, Pablo Gómez, Gabriele Meoni, Filip Svoboda, Nicholas D. Lane	2024-11-26	Download	Segmentation of Earth observation (EO) satellite data is critical for natural hazard analysis and disaster response. However, processing EO data at ground stations introduces delays due to data transm...
A Cloud-based Real-time Probabilistic Remaining Useful Life (RUL) Estimation using the Sequential Monte Carlo (SMC) Method	Karthik Reddy Lyathakula, Fuh-Gwo Yuan	2024-11-26	Download	The remaining useful life (RUL) estimation is an important metric that helps in condition-based maintenance. Damage data obtained from the diagnostics techniques are often noisy and the RUL estimated ...
APEX: An Extensible and Dynamism-Aware Simulator for Automated Parallel Execution in LLM Serving	Yi-Chien Lin, Woosuk Kwon, Ronald Pineda, Fanny Nina Paravecino	2024-11-26	Download	Efficiently serving Large Language Models (LLMs) requires selecting an optimal parallel execution plan, balancing computation, memory, and communication overhead.
Evaluating the Overhead of the Performance Profiler Cloudprofiler With MooBench	Shinhyung Yang, David Georg Reichelt, Wilhelm Hasselbring	2024-11-26	Download	Performance engineering has become crucial for the cloud-native architecture. This architecture deploys multiple services, with each service representing an orchestration of containerized processes.
Joint Resource Optimization, Computation Offloading and Resource Slicing for Multi-Edge Traffic-Cognitive Networks	Ting Xiaoyang, Minfeng Zhang, Shu gonglee, Saimin Chen Zhang	2024-11-26	Download	The evolving landscape of edge computing envisions platforms operating as dynamic intermediaries between application providers and edge servers (ESs), where task offloading is coupled with payments fo...
PIM-AI: A Novel Architecture for High-Efficiency LLM Inference	Cristobal Ortega, Yann Falevoz, Renaud Ayrignac	2024-11-26	Download	Large Language Models (LLMs) have become essential in a variety of applications due to their advanced language understanding and generation capabilities.
A High Energy-Efficiency Multi-core Neuromorphic Architecture for Deep SNN Training	Mingjing Li, Huihui Zhou, Xiaofeng Xu, Zhiwei Zhong, Puli Quan, Xueke Zhu, Yanyu Lin, Wenjie Lin, Hongyu Guo, Junchao Zhang, Yunhao Ma, Wei Wang, Qingyan Meng, Zhengyu Ma, Guoqi Li, Xiaoxin Cui, Yonghong Tian	2024-11-26	Download	There is a growing necessity for edge training to adapt to dynamically changing environment. Neuromorphic computing represents a significant pathway for high-efficiency intelligent computation in ener...
Distributed Load Balancing with Workload-Dependent Service Rates	Wenxin Zhang, Santiago R. Balseiro, Robert Kleinberg, Vahab Mirrokni, Balasubramanian Sivan, Bartek Wydrowski	2024-11-26	Download	We study distributed load balancing in bipartite queueing systems where frontends route jobs to heterogeneous backends with workload-dependent service rates.
KVPR: Efficient LLM Inference with I/O-Aware KV Cache Partial Recomputation	Chaoyi Jiang, Lei Gao, Hossein Entezari Zarch, Murali Annavaram	2024-11-26	Download	Inference for Large Language Models (LLMs) is computationally demanding. To reduce the cost of auto-regressive decoding, Key-Value (KV) cache is used to store intermediate activations, which significa...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Combining Threat Intelligence with IoT Scanning to Predict Cyber Attack	Jubin Abhishek Soni, Amit Anand, Rajesh Kumar Pandey, Aniket Abhishek Soni	2024-11-26	Download	While the Web has become a global platform for communication, malicious actors, including hackers and hacktivist groups, often disseminate ideological content and coordinate activities through the "Da...
AI-Augmented Ethical Hacking: A Practical Examination of Manual Exploitation and Privilege Escalation in Linux Environments	Haitham S. Al-Sinani, Chris J. Mitchell	2024-11-26	Download	This study explores the application of generative AI (GenAI) within manual exploitation and privilege escalation tasks in Linux-based penetration testing environments, two areas critical to comprehens...
A Primer on AP Power Save in Wi-Fi 8: Overview, Analysis, and Open Challenges	Roger Sanchez-Vital, Andrey Belogaev, Carles Gomez, Jeroen Famaey, Eduard Garcia-Villegas	2024-11-26	Download	Wi-Fi facilitates the Internet connectivity of billions of devices worldwide, making it an indispensable technology for modern life. Wi-Fi networks are becoming significantly denser, making energy con...
MetaGraphLoc: A Graph-based Meta-learning Scheme for Indoor Localization via Sensor Fusion	Yaya Etiabi, Eslam Eldeeb, Mohammad Shehab, Wafa Njima, Hirley Alves, Mohamed-Slim Alouini, El Mehdi Amhoud	2024-11-26	Download	Accurate indoor localization remains challenging due to variations in wireless signal environments and limited data availability. This paper introduces MetaGraphLoc, a novel system leveraging sensor f...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Tracing Optimization for Performance Modeling and Regression Detection	Kaveh Shahedi, Heng Li, Maxime Lamothe, Foutse Khomh	2024-11-26	Download	Software performance modeling plays a crucial role in developing and maintaining software systems. A performance model analytically describes the relationship between the performance of a system and i...
KVPR: Efficient LLM Inference with I/O-Aware KV Cache Partial Recomputation	Chaoyi Jiang, Lei Gao, Hossein Entezari Zarch, Murali Annavaram	2024-11-26	Download	Inference for Large Language Models (LLMs) is computationally demanding. To reduce the cost of auto-regressive decoding, Key-Value (KV) cache is used to store intermediate activations, which significa...