2024-06-28

cs.AR - Hardware Architecture

Title	Authors	Published	PDF	Abstract
CIS: Composable Instruction Set for Data Streaming Applications	Yu Yang, Jordi Altayó González, Paul Delestrac, Ahmed Hemani	2024-06-28	Download	The enhanced efficiency of hardware accelerators, including Single Instruction Multiple Data (SIMD) architectures and Coarse-Grained Reconfigurable Architectures (CGRAs), is driving significant advanc...
Automated Deep Neural Network Inference Partitioning for Distributed Embedded Systems	Fabian Kreß, El Mahdi El Annabi, Tim Hotfilter, Julian Hoefer, Tanja Harbaum, Juergen Becker	2024-06-28	Download	Distributed systems can be found in various applications, e.g., in robotics or autonomous driving, to achieve higher flexibility and robustness.
Idle is the New Sleep: Configuration-Aware Alternative to Powering Off FPGA-Based DL Accelerators During Inactivity	Chao Qian, Christopher Cichiwskyj, Tianheng Ling, Gregor Schiele	2024-06-28	Download	In the rapidly evolving Internet of Things (IoT) domain, we concentrate on enhancing energy efficiency in Deep Learning accelerators on FPGA-based heterogeneous platforms, aligning with the principles...
FRED: Flexible REduction-Distribution Interconnect and Communication Implementation for Wafer-Scale Distributed Training of DNN Models	Saeed Rashidi, William Won, Sudarshan Srinivasan, Puneet Gupta, Tushar Krishna	2024-06-28	Download	Distributed Deep Neural Network (DNN) training is a technique to reduce the training overhead by distributing the training tasks into multiple accelerators, according to a parallelization strategy.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Improving Locality in Sparse and Dense Matrix Multiplications	Mohammad Mahdi Salehi Dezfuli, Kazem Cheshmi	2024-06-28	Download	Consecutive matrix multiplications are commonly used in graph neural networks and sparse linear solvers. These operations frequently access the same matrices for both reading and writing.
Portability Efficiency Approach for Calculating Performance Portability	Ami Marowka	2024-06-28	Download	The emergence of heterogeneity in high-performance computing, which harnesses under one integrated system several platforms of different architectures, also led to the development of innovative cross-...
Kubernetes Deployment Options for On-Prem Clusters	Lincoln Bryant, Robert W. Gardner, Fengping Hu, David Jordan, Ryan P. Taylor	2024-06-28	Download	Over the last decade, the Kubernetes container orchestration platform has become essential to many scientific workflows. Despite its popularity, deploying a production-ready Kubernetes cluster on-prem...
NetNN: Neural Intrusion Detection System in Programmable Networks	Kamran Razavi, Shayan Davari Fard, George Karlos, Vinod Nigade, Max Mühlhäuser, Lin Wang	2024-06-28	Download	The rise of deep learning has led to various successful attempts to apply deep neural networks (DNNs) for important networking tasks such as intrusion detection.
Automated Deep Neural Network Inference Partitioning for Distributed Embedded Systems	Fabian Kreß, El Mahdi El Annabi, Tim Hotfilter, Julian Hoefer, Tanja Harbaum, Juergen Becker	2024-06-28	Download	Distributed systems can be found in various applications, e.g., in robotics or autonomous driving, to achieve higher flexibility and robustness.
Runtime Instrumentation for Reactive Components (Extended Version)	Luca Aceto, Duncan Paul Attard, Adrian Francalanza, Anna Ingólfsdóttir	2024-06-28	Download	Reactive software calls for instrumentation methods that uphold the reactive attributes of systems. Runtime verification imposes another demand on the instrumentation, namely that the trace event sequ...
Parameterized Verification of Round-based Distributed Algorithms via Extended Threshold Automata	Tom Baumeister, Paul Eichler, Swen Jacobs, Mouhammad Sakr, Marcus Völp	2024-06-28	Download	Threshold automata are a computational model that has proven to be versatile in modeling threshold-based distributed algorithms and enabling their completely automatic parameterized verification.
BinomialHash: A Constant Time, Minimal Memory Consistent Hash Algorithm	Massimo Coluzzi, Amos Brocco, Alessandro Antonucci, Tiziano Leidi	2024-06-28	Download	Consistent hashing is a technique for distributing data across a network of nodes in a way that minimizes reorganization when nodes join or leave the network.
Prediction based computation offloading and resource allocation for multi-access ISAC enabled IoT system	Duc-Thuan Le	2024-06-28	Download	In the new era of the Internet of Things (IoT), tasks are now being migrated to edge sites closer to data generators. Mobile devices inherently encounter limitations in terms of energy and computation...
InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management	Wonbeom Lee, Jungi Lee, Junghwan Seo, Jaewoong Sim	2024-06-28	Download	Transformer-based large language models (LLMs) demonstrate impressive performance across various natural language processing tasks. Serving LLM inference for generating long contents, however, poses a...
Personalized Interpretation on Federated Learning: A Virtual Concepts approach	Peng Yan, Guodong Long, Jing Jiang, Michael Blumenstein	2024-06-28	Download	Tackling non-IID data is an open challenge in federated learning research. Existing FL methods, including robust FL and personalized FL, are designed to improve model performance without consideration...
Machine-Learning-Driven Runtime Optimization of BLAS Level 3 on Modern Multi-Core Systems	Yufan Xia, Giuseppe Maria Junior Barca	2024-06-28	Download	BLAS Level 3 operations are essential for scientific computing, but finding the optimal number of threads for multi-threaded implementations on modern multi-core systems is challenging.
Online Optimization of DNN Inference Network Utility in Collaborative Edge Computing	Rui Li, Tao Ouyang, Liekang Zeng, Guocheng Liao, Zhi Zhou, Xu Chen	2024-06-28	Download	Collaborative Edge Computing (CEC) is an emerging paradigm that collaborates heterogeneous edge devices as a resource pool to compute DNN inference tasks in proximity such as edge video analytics.
A Survey on Failure Analysis and Fault Injection in AI Systems	Guangba Yu, Gou Tan, Haojia Huang, Zhenyu Zhang, Pengfei Chen, Roberto Natella, Zibin Zheng	2024-06-28	Download	The rapid advancement of Artificial Intelligence (AI) has led to its integration into various areas, especially with Large Language Models (LLMs) significantly enhancing capabilities in Artificial Int...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Koopman based trajectory model and computation offloading for high mobility paradigm in ISAC enabled IoT system	Minh-Tuan Tran	2024-06-28	Download	User experience on mobile devices is constrained by limited battery capacity and processing power, but 6G technology advancements are diving rapidly into mobile technical evolution.
QoE-Driven Multi-Task Offloading for Semantic-Aware Edge Computing Systems	Xuyang Chen, Daquan Feng, Wei Jiang, Qu Luo, Gaojie Chen, Yao Sun	2024-06-28	Download	Mobile edge computing (MEC) provides low-latency offloading solutions for computationally intensive tasks, effectively improving the computing efficiency and battery life of mobile devices.
Prediction based computation offloading and resource allocation for multi-access ISAC enabled IoT system	Duc-Thuan Le	2024-06-28	Download	In the new era of the Internet of Things (IoT), tasks are now being migrated to edge sites closer to data generators. Mobile devices inherently encounter limitations in terms of energy and computation...
End-to-End Uplink Performance Analysis of Satellite-Based IoT Networks: A Stochastic Geometry Approach	Jiusi Zhou, Ruibo Wang, Basem Shihada, Mohamed-Slim Alouini	2024-06-28	Download	With the deployment of satellite constellations, Internet-of-Things (IoT) devices in remote areas have gained access to low-cost network connectivity.

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
A parallel evolutionary algorithm to optimize dynamic memory managers in embedded systems	José L. Risco-Martín, David Atienza, J. Manuel Colmenar, Oscar Garnica	2024-06-28	Download	For the last thirty years, several Dynamic Memory Managers (DMMs) have been proposed. Such DMMs include first fit, best fit, segregated fit and buddy systems.