
2024-06-28

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| CIS: Composable Instruction Set for Data Streaming Applications | Yu Yang, Jordi Altayó González, Paul Delestrac, Ahmed Hemani | 2024-06-28 | Download | The enhanced efficiency of hardware accelerators, including Single Instruction Multiple Data (SIMD) architectures and Coarse-Grained Reconfigurable Architectures (CGRAs), is driving significant advanc... |
| Automated Deep Neural Network Inference Partitioning for Distributed Embedded Systems | Fabian Kreß, El Mahdi El Annabi, Tim Hotfilter, Julian Hoefer, Tanja Harbaum, Juergen Becker | 2024-06-28 | Download | Distributed systems can be found in various applications, e.g., in robotics or autonomous driving, to achieve higher flexibility and robustness. |
| Idle is the New Sleep: Configuration-Aware Alternative to Powering Off FPGA-Based DL Accelerators During Inactivity | Chao Qian, Christopher Cichiwskyj, Tianheng Ling, Gregor Schiele | 2024-06-28 | Download | In the rapidly evolving Internet of Things (IoT) domain, we concentrate on enhancing energy efficiency in Deep Learning accelerators on FPGA-based heterogeneous platforms, aligning with the principles... |
| FRED: Flexible REduction-Distribution Interconnect and Communication Implementation for Wafer-Scale Distributed Training of DNN Models | Saeed Rashidi, William Won, Sudarshan Srinivasan, Puneet Gupta, Tushar Krishna | 2024-06-28 | Download | Distributed Deep Neural Network (DNN) training is a technique to reduce the training overhead by distributing the training tasks into multiple accelerators, according to a parallelization strategy. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Improving Locality in Sparse and Dense Matrix Multiplications | Mohammad Mahdi Salehi Dezfuli, Kazem Cheshmi | 2024-06-28 | Download | Consecutive matrix multiplications are commonly used in graph neural networks and sparse linear solvers. These operations frequently access the same matrices for both reading and writing. |
| Portability Efficiency Approach for Calculating Performance Portability | Ami Marowka | 2024-06-28 | Download | The emergence of heterogeneity in high-performance computing, which harnesses under one integrated system several platforms of different architectures, also led to the development of innovative cross-... |
| Kubernetes Deployment Options for On-Prem Clusters | Lincoln Bryant, Robert W. Gardner, Fengping Hu, David Jordan, Ryan P. Taylor | 2024-06-28 | Download | Over the last decade, the Kubernetes container orchestration platform has become essential to many scientific workflows. Despite its popularity, deploying a production-ready Kubernetes cluster on-prem... |
| NetNN: Neural Intrusion Detection System in Programmable Networks | Kamran Razavi, Shayan Davari Fard, George Karlos, Vinod Nigade, Max Mühlhäuser, Lin Wang | 2024-06-28 | Download | The rise of deep learning has led to various successful attempts to apply deep neural networks (DNNs) for important networking tasks such as intrusion detection. |
| Automated Deep Neural Network Inference Partitioning for Distributed Embedded Systems | Fabian Kreß, El Mahdi El Annabi, Tim Hotfilter, Julian Hoefer, Tanja Harbaum, Juergen Becker | 2024-06-28 | Download | Distributed systems can be found in various applications, e.g., in robotics or autonomous driving, to achieve higher flexibility and robustness. |
| Runtime Instrumentation for Reactive Components (Extended Version) | Luca Aceto, Duncan Paul Attard, Adrian Francalanza, Anna Ingólfsdóttir | 2024-06-28 | Download | Reactive software calls for instrumentation methods that uphold the reactive attributes of systems. Runtime verification imposes another demand on the instrumentation, namely that the trace event sequ... |
| Parameterized Verification of Round-based Distributed Algorithms via Extended Threshold Automata | Tom Baumeister, Paul Eichler, Swen Jacobs, Mouhammad Sakr, Marcus Völp | 2024-06-28 | Download | Threshold automata are a computational model that has proven to be versatile in modeling threshold-based distributed algorithms and enabling their completely automatic parameterized verification. |
| BinomialHash: A Constant Time, Minimal Memory Consistent Hash Algorithm | Massimo Coluzzi, Amos Brocco, Alessandro Antonucci, Tiziano Leidi | 2024-06-28 | Download | Consistent hashing is a technique for distributing data across a network of nodes in a way that minimizes reorganization when nodes join or leave the network. |
| Prediction based computation offloading and resource allocation for multi-access ISAC enabled IoT system | Duc-Thuan Le | 2024-06-28 | Download | In the new era of the Internet of Things (IoT), tasks are now being migrated to edge sites closer to data generators. Mobile devices inherently encounter limitations in terms of energy and computation... |
| InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management | Wonbeom Lee, Jungi Lee, Junghwan Seo, Jaewoong Sim | 2024-06-28 | Download | Transformer-based large language models (LLMs) demonstrate impressive performance across various natural language processing tasks. Serving LLM inference for generating long contents, however, poses a... |
| Personalized Interpretation on Federated Learning: A Virtual Concepts approach | Peng Yan, Guodong Long, Jing Jiang, Michael Blumenstein | 2024-06-28 | Download | Tackling non-IID data is an open challenge in federated learning research. Existing FL methods, including robust FL and personalized FL, are designed to improve model performance without consideration... |
| Machine-Learning-Driven Runtime Optimization of BLAS Level 3 on Modern Multi-Core Systems | Yufan Xia, Giuseppe Maria Junior Barca | 2024-06-28 | Download | BLAS Level 3 operations are essential for scientific computing, but finding the optimal number of threads for multi-threaded implementations on modern multi-core systems is challenging. |
| Online Optimization of DNN Inference Network Utility in Collaborative Edge Computing | Rui Li, Tao Ouyang, Liekang Zeng, Guocheng Liao, Zhi Zhou, Xu Chen | 2024-06-28 | Download | Collaborative Edge Computing (CEC) is an emerging paradigm that collaborates heterogeneous edge devices as a resource pool to compute DNN inference tasks in proximity such as edge video analytics. |
| A Survey on Failure Analysis and Fault Injection in AI Systems | Guangba Yu, Gou Tan, Haojia Huang, Zhenyu Zhang, Pengfei Chen, Roberto Natella, Zibin Zheng | 2024-06-28 | Download | The rapid advancement of Artificial Intelligence (AI) has led to its integration into various areas, especially with Large Language Models (LLMs) significantly enhancing capabilities in Artificial Int... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Koopman based trajectory model and computation offloading for high mobility paradigm in ISAC enabled IoT system | Minh-Tuan Tran | 2024-06-28 | Download | User experience on mobile devices is constrained by limited battery capacity and processing power, but 6G technology advancements are diving rapidly into mobile technical evolution. |
| QoE-Driven Multi-Task Offloading for Semantic-Aware Edge Computing Systems | Xuyang Chen, Daquan Feng, Wei Jiang, Qu Luo, Gaojie Chen, Yao Sun | 2024-06-28 | Download | Mobile edge computing (MEC) provides low-latency offloading solutions for computationally intensive tasks, effectively improving the computing efficiency and battery life of mobile devices. |
| Prediction based computation offloading and resource allocation for multi-access ISAC enabled IoT system | Duc-Thuan Le | 2024-06-28 | Download | In the new era of the Internet of Things (IoT), tasks are now being migrated to edge sites closer to data generators. Mobile devices inherently encounter limitations in terms of energy and computation... |
| End-to-End Uplink Performance Analysis of Satellite-Based IoT Networks: A Stochastic Geometry Approach | Jiusi Zhou, Ruibo Wang, Basem Shihada, Mohamed-Slim Alouini | 2024-06-28 | Download | With the deployment of satellite constellations, Internet-of-Things (IoT) devices in remote areas have gained access to low-cost network connectivity. |

cs.OS - Operating Systems

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| A parallel evolutionary algorithm to optimize dynamic memory managers in embedded systems | José L. Risco-Martín, David Atienza, J. Manuel Colmenar, Oscar Garnica | 2024-06-28 | Download | For the last thirty years, several Dynamic Memory Managers (DMMs) have been proposed. Such DMMs include first fit, best fit, segregated fit and buddy systems. |
