2024-10-10

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
MENAGE: Mixed-Signal Event-Driven Neuromorphic Accelerator for Edge Applications	Armin Abdollahi, Mehdi Kamal, Massoud Pedram	2024-10-10	Download	This paper presents a mixed-signal neuromorphic accelerator architecture designed for accelerating inference with event-based neural network models.
RISC-V V Vector Extension (RVV) with reduced number of vector registers	Eino Jacobs, Dmitry Utyansky, Muhammad Hassan, Thomas Roecker	2024-10-10	Download	To reduce the area of RISC-V Vector extension (RVV) in small processors, the authors are considering one simple modification: reduce the number of registers in the vector register file.
Neural Architecture Search of Hybrid Models for NPU-CIM Heterogeneous AR/VR Devices	Yiwei Zhao, Ziyun Li, Win-San Khwa, Xiaoyu Sun, Sai Qian Zhang, Syed Shakib Sarwar, Kleber Hugo Stangherlin, Yi-Lun Lu, Jorge Tomas Gomez, Jae-Sun Seo, Phillip B. Gibbons, Barbara De Salvo, Chiao Liu	2024-10-10	Download	Low-Latency and Low-Power Edge AI is essential for Virtual Reality and Augmented Reality applications. Recent advances show that hybrid models, combining convolution layers (CNN) and transformers (ViT...
M$^2$-ViT: Accelerating Hybrid Vision Transformers with Two-Level Mixed Quantization	Yanbiao Liang, Huihong Shi, Zhongfeng Wang	2024-10-10	Download	Although Vision Transformers (ViTs) have achieved significant success, their intensive computations and substantial memory overheads challenge their deployment on edge devices.
vCLIC: Towards Fast Interrupt Handling in Virtualized RISC-V Mixed-criticality Systems	Enrico Zelioli, Alessandro Ottaviano, Robert Balas, Nils Wistoff, Angelo Garofalo, Luca Benini	2024-10-10	Download	The widespread diffusion of compute-intensive edge-AI workloads and the stringent demands of modern autonomous systems require advanced heterogeneous embedded architectures.
GUST: Graph Edge-Coloring Utilization for Accelerating Sparse Matrix Vector Multiplication	Armin Gerami, Bahar Asgari	2024-10-10	Download	Sparse matrix-vector multiplication (SpMV) plays a vital role in various scientific and engineering fields, from scientific computing to machine learning.
The BRAM is the Limit: Shattering Myths, Shaping Standards, and Building Scalable PIM Accelerators	MD Arafat Kabir, Tendayi Kamucheka, Nathaniel Fredricks, Joel Mandebi, Jason Bakos, Miaoqing Huang, David Andrews	2024-10-10	Download	Many recent FPGA-based Processor-in-Memory (PIM) architectures have appeared with promises of impressive levels of parallelism but with performance that falls short of expectations due to reduced maxi...
Reducing the Cost of Dropout in Flash-Attention by Hiding RNG with GEMM	Haiyue Ma, Jian Liu, Ronny Krashinsky	2024-10-10	Download	Dropout, a network operator, when enabled is likely to dramatically impact the performance of Flash-Attention, which in turn increases the end-to-end training time of Large-Language-Models (LLMs).

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
A Simulated Annealing Approach to Identical Parallel Machine Scheduling	Jiaxing Li, David Perkins	2024-10-10	Download	This paper studies the application of the simulated annealing metaheuristic on the identical parallel machine scheduling problem, a variant of the broader optimal job scheduling problem.
POSEIDON: Efficient Function Placement at the Edge using Deep Reinforcement Learning	Prakhar Jain, Prakhar Singhal, Divyansh Pandey, Giovanni Quattrocchi, Karthik Vaidhyanathan	2024-10-10	Download	Edge computing allows for reduced latency and operational costs compared to centralized cloud systems. In this context, serverless functions are emerging as a lightweight and effective paradigm for ma...
Agent-based modeling for realistic reproduction of human mobility and contact behavior to evaluate test and isolation strategies in epidemic infectious disease spread	David Kerkmann, Sascha Korf, Khoa Nguyen, Daniel Abele, Alain Schengen, Carlotta Gerstein, Jens Henrik Göbbert, Achim Basermann, Martin J. Kühn, Michael Meyer-Hermann	2024-10-10	Download	Agent-based models have proven to be useful tools in supporting decision-making processes in different application domains. The advent of modern computers and supercomputers has enabled these bottom-u...
NLP-Guided Synthesis: Transitioning from Sequential Programs to Distributed Programs	Arun Sanjel, Bikram Khanal, Greg Speegle, Pablo Rivas	2024-10-10	Download	As the need for large-scale data processing grows, distributed programming frameworks like PySpark have become increasingly popular. However, the task of converting traditional, sequential code to dis...
AI Surrogate Model for Distributed Computing Workloads	David K. Park, Yihui Ren, Ozgur O. Kilic, Tatiana Korchuganova, Sairam Sri Vatsavai, Joseph Boudreau, Tasnuva Chowdhury, Shengyu Feng, Raees Khan, Jaehyung Kim, Scott Klasky, Tadashi Maeno, Paul Nilsson, Verena Ingrid Martinez Outschoorn, Norbert Podhorszki, Frederic Suter, Wei Yang, Yiming Yang, Shinjae Yoo, Alexei Klimentov, Adolfy Hoisie	2024-10-10	Download	Large-scale international scientific collaborations, such as ATLAS, Belle II, CMS, and DUNE, generate vast volumes of data. These experiments necessitate substantial computational power for varied tas...
A Cloud in the Sky: Geo-Aware On-board Data Services for LEO Satellites	Thomas Sandholm, Sayandev Mukherjee, Bernardo A Huberman	2024-10-10	Download	We propose an architecture with accompanying protocol for on-board satellite data infrastructure designed for Low Earth Orbit (LEO) constellations offering communication services, such as direct-to-ce...
Exploring the Landscape of Distributed Graph Sketching	David Tench, Evan T. West, Kenny Zhang, Michael Bender, Daniel DeLayo, Martin Farach-Colton, Gilvir Gill, Tyler Seip, Victor Zhang	2024-10-10	Download	Recent work has initiated the study of dense graph processing using graph sketching methods, which drastically reduce space costs by lossily compressing information about the input graph.
SALINA: Towards Sustainable Live Sonar Analytics in Wild Ecosystems	Chi Xu, Rongsheng Qian, Hao Fang, Xiaoqiang Ma, William I. Atlas, Jiangchuan Liu, Mark A. Spoljaric	2024-10-10	Download	Sonar radar captures visual representations of underwater objects and structures using sound wave reflections, making it essential for exploration, mapping, and continuous surveillance in wild ecosyst...
Efficient Adaptive Federated Optimization	Su Hyeong Lee, Sidharth Sharma, Manzil Zaheer, Tian Li	2024-10-10	Download	Adaptive optimization is critical in federated learning, where enabling adaptivity on both the server and client sides has proven essential for achieving optimal performance.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
"It's Your Turn": A Novel Channel Contention Mechanism for Improving Wi-Fi's Reliability	Francesc Wilhelmi, Lorenzo Galati-Giordano, Gianluca Fontanesi	2024-10-10	Download	The next generation of Wi-Fi, i.e., the IEEE 802.11bn (aka Wi-Fi 8), is not only expected to increase its performance and provide extended capabilities but also aims to offer a reliable service.

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Neural Architecture Search of Hybrid Models for NPU-CIM Heterogeneous AR/VR Devices	Yiwei Zhao, Ziyun Li, Win-San Khwa, Xiaoyu Sun, Sai Qian Zhang, Syed Shakib Sarwar, Kleber Hugo Stangherlin, Yi-Lun Lu, Jorge Tomas Gomez, Jae-Sun Seo, Phillip B. Gibbons, Barbara De Salvo, Chiao Liu	2024-10-10	Download	Low-Latency and Low-Power Edge AI is essential for Virtual Reality and Augmented Reality applications. Recent advances show that hybrid models, combining convolution layers (CNN) and transformers (ViT...
Catastrophic Cyber Capabilities Benchmark (3CB): Robustly Evaluating LLM Agent Cyber Offense Capabilities	Andrey Anurin, Jonathan Ng, Kibo Schaffer, Jason Schreiber, Esben Kran	2024-10-10	Download	LLM agents have the potential to revolutionize defensive cyber operations, but their offensive capabilities are not yet fully understood. To prepare for emerging threats, model developers and governme...
Plug-and-Play Performance Estimation for LLM Services without Relying on Labeled Data	Can Wang, Dianbo Sui, Hongliang Sun, Hao Ding, Bolin Zhang, Zhiying Tu	2024-10-10	Download	Large Language Model (LLM) services exhibit impressive capability on unlearned tasks leveraging only a few examples by in-context learning (ICL).
An Analysis of XML Compression Efficiency	Christopher James Augeri, Barry E. Mullins, Leemon C. Baird, Dursun A. Bulutoglu, Rusty O. Baldwin	2024-10-10	Download	XML simplifies data exchange among heterogeneous computers, but it is notoriously verbose and has spawned the development of many XML-specific compressors and binary formats.