2024-06-28
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| CIS: Composable Instruction Set for Data Streaming Applications | Yu Yang, Jordi Altayó González, Paul Delestrac, Ahmed Hemani | 2024-06-28 | Download | The enhanced efficiency of hardware accelerators, including Single Instruction Multiple Data (SIMD) architectures and Coarse-Grained Reconfigurable Architectures (CGRAs), is driving significant advanc... |
| Automated Deep Neural Network Inference Partitioning for Distributed Embedded Systems | Fabian Kreß, El Mahdi El Annabi, Tim Hotfilter, Julian Hoefer, Tanja Harbaum, Juergen Becker | 2024-06-28 | Download | Distributed systems can be found in various applications, e.g., in robotics or autonomous driving, to achieve higher flexibility and robustness. |
| Idle is the New Sleep: Configuration-Aware Alternative to Powering Off FPGA-Based DL Accelerators During Inactivity | Chao Qian, Christopher Cichiwskyj, Tianheng Ling, Gregor Schiele | 2024-06-28 | Download | In the rapidly evolving Internet of Things (IoT) domain, we concentrate on enhancing energy efficiency in Deep Learning accelerators on FPGA-based heterogeneous platforms, aligning with the principles... |
| FRED: Flexible REduction-Distribution Interconnect and Communication Implementation for Wafer-Scale Distributed Training of DNN Models | Saeed Rashidi, William Won, Sudarshan Srinivasan, Puneet Gupta, Tushar Krishna | 2024-06-28 | Download | Distributed Deep Neural Network (DNN) training is a technique to reduce the training overhead by distributing the training tasks into multiple accelerators, according to a parallelization strategy. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Improving Locality in Sparse and Dense Matrix Multiplications | Mohammad Mahdi Salehi Dezfuli, Kazem Cheshmi | 2024-06-28 | Download | Consecutive matrix multiplications are commonly used in graph neural networks and sparse linear solvers. These operations frequently access the same matrices for both reading and writing. |
| Portability Efficiency Approach for Calculating Performance Portability | Ami Marowka | 2024-06-28 | Download | The emergence of heterogeneity in high-performance computing, which harnesses under one integrated system several platforms of different architectures, also led to the development of innovative cross-... |
| Kubernetes Deployment Options for On-Prem Clusters | Lincoln Bryant, Robert W. Gardner, Fengping Hu, David Jordan, Ryan P. Taylor | 2024-06-28 | Download | Over the last decade, the Kubernetes container orchestration platform has become essential to many scientific workflows. Despite its popularity, deploying a production-ready Kubernetes cluster on-prem... |
| NetNN: Neural Intrusion Detection System in Programmable Networks | Kamran Razavi, Shayan Davari Fard, George Karlos, Vinod Nigade, Max Mühlhäuser, Lin Wang | 2024-06-28 | Download | The rise of deep learning has led to various successful attempts to apply deep neural networks (DNNs) for important networking tasks such as intrusion detection. |
| Automated Deep Neural Network Inference Partitioning for Distributed Embedded Systems | Fabian Kreß, El Mahdi El Annabi, Tim Hotfilter, Julian Hoefer, Tanja Harbaum, Juergen Becker | 2024-06-28 | Download | Distributed systems can be found in various applications, e.g., in robotics or autonomous driving, to achieve higher flexibility and robustness. |
| Runtime Instrumentation for Reactive Components (Extended Version) | Luca Aceto, Duncan Paul Attard, Adrian Francalanza, Anna Ingólfsdóttir | 2024-06-28 | Download | Reactive software calls for instrumentation methods that uphold the reactive attributes of systems. Runtime verification imposes another demand on the instrumentation, namely that the trace event sequ... |
| Parameterized Verification of Round-based Distributed Algorithms via Extended Threshold Automata | Tom Baumeister, Paul Eichler, Swen Jacobs, Mouhammad Sakr, Marcus Völp | 2024-06-28 | Download | Threshold automata are a computational model that has proven to be versatile in modeling threshold-based distributed algorithms and enabling their completely automatic parameterized verification. |
| BinomialHash: A Constant Time, Minimal Memory Consistent Hash Algorithm | Massimo Coluzzi, Amos Brocco, Alessandro Antonucci, Tiziano Leidi | 2024-06-28 | Download | Consistent hashing is a technique for distributing data across a network of nodes in a way that minimizes reorganization when nodes join or leave the network. |
| Prediction based computation offloading and resource allocation for multi-access ISAC enabled IoT system | Duc-Thuan Le | 2024-06-28 | Download | In the new era of the Internet of Things (IoT), tasks are now being migrated to edge sites closer to data generators. Mobile devices inherently encounter limitations in terms of energy and computation... |
| InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management | Wonbeom Lee, Jungi Lee, Junghwan Seo, Jaewoong Sim | 2024-06-28 | Download | Transformer-based large language models (LLMs) demonstrate impressive performance across various natural language processing tasks. Serving LLM inference for generating long contents, however, poses a... |
| Personalized Interpretation on Federated Learning: A Virtual Concepts approach | Peng Yan, Guodong Long, Jing Jiang, Michael Blumenstein | 2024-06-28 | Download | Tackling non-IID data is an open challenge in federated learning research. Existing FL methods, including robust FL and personalized FL, are designed to improve model performance without consideration... |
| Machine-Learning-Driven Runtime Optimization of BLAS Level 3 on Modern Multi-Core Systems | Yufan Xia, Giuseppe Maria Junior Barca | 2024-06-28 | Download | BLAS Level 3 operations are essential for scientific computing, but finding the optimal number of threads for multi-threaded implementations on modern multi-core systems is challenging. |
| Online Optimization of DNN Inference Network Utility in Collaborative Edge Computing | Rui Li, Tao Ouyang, Liekang Zeng, Guocheng Liao, Zhi Zhou, Xu Chen | 2024-06-28 | Download | Collaborative Edge Computing (CEC) is an emerging paradigm that collaborates heterogeneous edge devices as a resource pool to compute DNN inference tasks in proximity such as edge video analytics. |
| A Survey on Failure Analysis and Fault Injection in AI Systems | Guangba Yu, Gou Tan, Haojia Huang, Zhenyu Zhang, Pengfei Chen, Roberto Natella, Zibin Zheng | 2024-06-28 | Download | The rapid advancement of Artificial Intelligence (AI) has led to its integration into various areas, especially with Large Language Models (LLMs) significantly enhancing capabilities in Artificial Int... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Koopman based trajectory model and computation offloading for high mobility paradigm in ISAC enabled IoT system | Minh-Tuan Tran | 2024-06-28 | Download | User experience on mobile devices is constrained by limited battery capacity and processing power, but 6G technology advancements are diving rapidly into mobile technical evolution. |
| QoE-Driven Multi-Task Offloading for Semantic-Aware Edge Computing Systems | Xuyang Chen, Daquan Feng, Wei Jiang, Qu Luo, Gaojie Chen, Yao Sun | 2024-06-28 | Download | Mobile edge computing (MEC) provides low-latency offloading solutions for computationally intensive tasks, effectively improving the computing efficiency and battery life of mobile devices. |
| Prediction based computation offloading and resource allocation for multi-access ISAC enabled IoT system | Duc-Thuan Le | 2024-06-28 | Download | In the new era of the Internet of Things (IoT), tasks are now being migrated to edge sites closer to data generators. Mobile devices inherently encounter limitations in terms of energy and computation... |
| End-to-End Uplink Performance Analysis of Satellite-Based IoT Networks: A Stochastic Geometry Approach | Jiusi Zhou, Ruibo Wang, Basem Shihada, Mohamed-Slim Alouini | 2024-06-28 | Download | With the deployment of satellite constellations, Internet-of-Things (IoT) devices in remote areas have gained access to low-cost network connectivity. |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| A parallel evolutionary algorithm to optimize dynamic memory managers in embedded systems | José L. Risco-Martín, David Atienza, J. Manuel Colmenar, Oscar Garnica | 2024-06-28 | Download | For the last thirty years, several Dynamic Memory Managers (DMMs) have been proposed. Such DMMs include first fit, best fit, segregated fit and buddy systems. |