
2024-09-25

cs.AR - Architecture

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| Benchmarking Deep Learning Models for Object Detection on Edge Computing Devices | Daghash K. Alqahtani, Aamir Cheema, Adel N. Toosi | 2024-09-25 | Modern applications, such as autonomous vehicles, require deploying deep learning algorithms on resource-constrained edge devices for real-time image and video processing. |
| PhD Forum: Efficient Privacy-Preserving Processing via Memory-Centric Computing | Mpoki Mwaisela | 2024-09-25 | Privacy-preserving computation techniques like homomorphic encryption (HE) and secure multi-party computation (SMPC) enhance data security by enabling processing on encrypted data. |
| HURRY: Highly Utilized, Reconfigurable ReRAM-based In-situ Accelerator with Multifunctionality | Hery Shin, Jae-Young Kim, Donghyuk Kim, Joo-Young Kim | 2024-09-25 | Resistive random-access memory (ReRAM) crossbar arrays are suitable for efficient inference computations in neural networks due to their analog general matrix-matrix multiplication (GEMM) capabilities... |
| PIFS-Rec: Process-In-Fabric-Switch for Large-Scale Recommendation System Inferences | Pingyi Huo, Anusha Devulapally, Hasan Al Maruf, Minseo Park, Krishnakumar Nair, Meena Arunachalam, Gulsum Gudukbay Akbulut, Mahmut Taylan Kandemir, Vijaykrishnan Narayanan | 2024-09-25 | Deep Learning Recommendation Models (DLRMs) have become increasingly popular and prevalent in today's datacenters, consuming most of the AI inference cycles. |
| Ascend HiFloat8 Format for Deep Learning | Yuanyong Luo, Zhongxing Zhang, Richard Wu, Hu Liu, Ying Jin, Kai Zheng, Minmin Wang, Zhanying He, Guipeng Hu, Luyao Chen, Tianchi Hu, Junsong Wang, Minqi Chen, Mikhaylov Dmitry, Korviakov Vladimir, Bobrin Maxim, Yuhao Hu, Guanfu Chen, Zeyi Huang | 2024-09-25 | This preliminary white paper proposes a novel 8-bit floating-point data format HiFloat8 (abbreviated as HiF8) for deep learning. HiF8 features tapered precision. |
| Omni 3D: BEOL-Compatible 3D Logic with Omnipresent Power, Signal, and Clock | Suhyeong Choi, Carlo Gilardi, Paul Gutwin, Robert M. Radway, Tathagata Srimani, Subhasish Mitra | 2024-09-25 | This paper presents Omni 3D - a 3D-stacked device architecture that is naturally enabled by back-end-of-line (BEOL)-compatible transistors. Omni 3D arbitrarily interleaves metal layers for both signal... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| EfiMon: A Process Analyser for Granular Power Consumption Prediction | Luis G. León-Vega, Niccolò Tosato, Stefano Cozzini | 2024-09-25 | High-performance computing (HPC) and supercomputing are critical in Artificial Intelligence (AI) research, development, and deployment. The extensive use of supercomputers for training complex AI mode... |
| High-Performance Implementation of the Optimized Event Generator for Strong-Field QED Plasma Simulations | Elena Panova, Valentin Volokitin, Aleksei Bashinov, Alexander Muraviev, Evgeny Efimenko, Iosif Meyerov | 2024-09-25 | Numerical simulation of strong-field quantum electrodynamics (SFQED) processes is an essential step towards current and future high-intensity laser experiments. |
| Scalable quality control on processing of large diffusion-weighted and structural magnetic resonance imaging datasets | Michael E. Kim, Chenyu Gao, Karthik Ramadass, Praitayini Kanakaraj, Nancy R. Newlin, Gaurav Rudravaram, Kurt G. Schilling, Blake E. Dewey, David A. Bennett, Sid OBryant, Robert C. Barber, Derek Archer, Timothy J. Hohman, Shunxing Bao, Zhiyuan Li, Bennett A. Landman, Nazirah Mohd Khairi, The Alzheimers Disease Neuroimaging Initiative, The HABSHD Study Team | 2024-09-25 | Proper quality control (QC) is time consuming when working with large-scale medical imaging datasets, yet necessary, as poor-quality data can lead to erroneous conclusions or poorly trained machine le... |
| SHEATH: Defending Horizontal Collaboration for Distributed CNNs against Adversarial Noise | Muneeba Asif, Mohammad Kumail Kazmi, Mohammad Ashiqur Rahman, Syed Rafay Hasan, Soamar Homsi | 2024-09-25 | As edge computing and the Internet of Things (IoT) expand, horizontal collaboration (HC) emerges as a distributed data processing solution for resource-constrained devices. |
| No Request Left Behind: Tackling Heterogeneity in Long-Context LLM Inference with Medha | Amey Agrawal, Haoran Qiu, Junda Chen, Íñigo Goiri, Chaojie Zhang, Rayyan Shahid, Ramachandran Ramjee, Alexey Tumanov, Esha Choukse | 2024-09-25 | Deploying million-token Large Language Models (LLMs) is challenging because production workloads are highly heterogeneous, mixing short queries and long documents. |
| Syndeo: Portable Ray Clusters with Secure Containerization | William Li, Rodney S. Lafuente Mercado, Jaime D. Pena, Ross E. Allen | 2024-09-25 | We present Syndeo: a software framework for container orchestration of Ray on Slurm. In general, the idea behind Syndeo is to write code once and deploy anywhere. |
| Running Cloud-native Workloads on HPC with High-Performance Kubernetes | Antony Chazapis, Evangelos Maliaroudakis, Fotis Nikolaidis, Manolis Marazakis, Angelos Bilas | 2024-09-25 | The escalating complexity of applications and services encourages a shift towards higher-level data processing pipelines that integrate both Cloud-native and HPC steps into the same workflow. |
| Benchmarking Deep Learning Models for Object Detection on Edge Computing Devices | Daghash K. Alqahtani, Aamir Cheema, Adel N. Toosi | 2024-09-25 | Modern applications, such as autonomous vehicles, require deploying deep learning algorithms on resource-constrained edge devices for real-time image and video processing. |
| miniLB: A Performance Portability Study of Lattice-Boltzmann Simulations | Luigi Crisci, Biagio Cosenza, Giorgio Amati, Matteo Turisini | 2024-09-25 | The Lattice Boltzmann Method (LBM) is a computational technique of Computational Fluid Dynamics (CFD) that has gained popularity due to its high parallelism and ability to handle complex geometries wi... |
| PhD Forum: Efficient Privacy-Preserving Processing via Memory-Centric Computing | Mpoki Mwaisela | 2024-09-25 | Privacy-preserving computation techniques like homomorphic encryption (HE) and secure multi-party computation (SMPC) enhance data security by enabling processing on encrypted data. |
| PIFS-Rec: Process-In-Fabric-Switch for Large-Scale Recommendation System Inferences | Pingyi Huo, Anusha Devulapally, Hasan Al Maruf, Minseo Park, Krishnakumar Nair, Meena Arunachalam, Gulsum Gudukbay Akbulut, Mahmut Taylan Kandemir, Vijaykrishnan Narayanan | 2024-09-25 | Deep Learning Recommendation Models (DLRMs) have become increasingly popular and prevalent in today's datacenters, consuming most of the AI inference cycles. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| A Hierarchical Gradient Tracking Algorithm for Mitigating Subnet-Drift in Fog Learning Networks | Evan Chen, Shiqiang Wang, Christopher G. Brinton | 2024-09-25 | Federated learning (FL) encounters scalability challenges when implemented over fog networks that do not follow FL's conventional star topology architecture. |
| Evaluation of Spectrum Sharing Algorithms for Networks with Heterogeneous Wireless Devices | Ankit Walishetti, Igor Kadota, Aidan Kim, Colin Ward, Eduardo Gutierrez, Randall Berry | 2024-09-25 | As highlighted in the National Spectrum Strategy, Dynamic Spectrum Access (DSA) is key for enabling 6G networks to meet the increasing demand for spectrum from various, heterogeneous emerging applicat... |
| Learning with Dynamics: Autonomous Regulation of UAV Based Communication Networks with Dynamic UAV Crew | Ran Zhang, Bowei Li, Liyuan Zhang, Jiang Xie, Miao Wang | 2024-09-25 | Unmanned Aerial Vehicle (UAV) based communication networks (UCNs) are a key component in future mobile networking. To handle the dynamic environments in UCNs, reinforcement learning (RL) has been a pr... |
| NetScaNDN: A Scalable and Flexible Testbed To Evaluate NDN on Multiple Infrastructures | Amir Esmaeili, Maryam Fazli | 2024-09-25 | The evolution from traditional IP-based networking to Named Data Networking (NDN) represents a paradigm shift to address the inherent limitations of current network architectures, such as scalability,... |
| Predictive Covert Communication Against Multi-UAV Surveillance Using Graph Koopman Autoencoder | Sivaram Krishnan, Jihong Park, Gregory Sherman, Benjamin Campbell, Jinho Choi | 2024-09-25 | Low Probability of Detection (LPD) communication aims to obscure the presence of radio frequency (RF) signals to evade surveillance. In the context of mobile surveillance utilizing unmanned aerial veh... |
| Space-Based Quantum Internet: Entanglement Distribution in Time-Varying LEO Constellations | Seid Koudia, Junaid ur Rehman, Symeon Chatzinotas | 2024-09-25 | This paper addresses the complexities of entanglement distribution in LEO satellite networks, particularly those arising from their dynamic topology. |
| Bridge to Real Environment with Hardware-in-the-loop for Wireless Artificial Intelligence Paradigms | Jeffrey Redondo, Nauman Aslam, Juan Zhang, Zhenhui Yuan | 2024-09-25 | Nowadays, many machine learning (ML) solutions to improve the wireless standard IEEE802.11p for Vehicular Adhoc Network (VANET) are commonly evaluated in the simulated world. |
| Asynchronous Fractional Multi-Agent Deep Reinforcement Learning for Age-Minimal Mobile Edge Computing | Lyudong Jin, Ming Tang, Jiayu Pan, Meng Zhang, Hao Wang | 2024-09-25 | In the realm of emerging real-time networked applications such as cyber-physical systems (CPS), the Age of Information (AoI) has emerged as a pivotal metric for evaluating timeliness. |
| Joint Mobile IAB Node Positioning and Scheduler Selection in Locations With Substantial Obstacles | Paulo Furtado Correia, Andre Coelho, Manuel Ricardo | 2024-09-25 | Integrated Access and Backhaul (IAB) in cellular networks combines access and backhaul within a wireless infrastructure, reducing reliance on fibre-based backhaul. |
| Interpreting Deep Neural Network-Based Receiver Under Varying Signal-To-Noise Ratios | Marko Tuononen, Dani Korpi, Ville Hautamäki | 2024-09-25 | We propose a novel method for interpreting neural networks, focusing on a convolutional neural network-based receiver model. The method identifies which unit or units of the model contain most (or least... |
| xDevSM: Streamlining xApp Development With a Flexible Framework for O-RAN E2 Service Models | Angelo Feraudo, Stefano Maxenti, Andrea Lacava, Paolo Bellavista, Michele Polese, Tommaso Melodia | 2024-09-25 | RAN Intelligent Controllers (RICs) are programmable platforms that enable data-driven closed-loop control in the O-RAN architecture. They collect telemetry and data from the RAN, process it in custom ... |

cs.OS - Operating Systems

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| FusionANNS: An Efficient CPU/GPU Cooperative Processing Architecture for Billion-scale Approximate Nearest Neighbor Search | Bing Tian, Haikun Liu, Yuhang Tang, Shihai Xiao, Zhuohui Duan, Xiaofei Liao, Xuecang Zhang, Junhua Zhu, Yu Zhang | 2024-09-25 | Approximate nearest neighbor search (ANNS) has emerged as a crucial component of database and AI infrastructure. Ever-increasing vector datasets pose significant challenges in terms of performance, co... |

cs.PF - Performance

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| Results of the Big ANN: NeurIPS'23 competition | Harsha Vardhan Simhadri, Martin Aumüller, Amir Ingber, Matthijs Douze, George Williams, Magdalen Dobson Manohar, Dmitry Baranchuk, Edo Liberty, Frank Liu, Ben Landrum, Mazin Karjikar, Laxman Dhulipala, Meng Chen, Yue Chen, Rui Ma, Kai Zhang, Yuzheng Cai, Jiayang Shi, Yizhuo Chen, Weiguo Zheng, Zihao Wan, Jie Yin, Ben Huang | 2024-09-25 | The 2023 Big ANN Challenge, held at NeurIPS 2023, focused on advancing the state-of-the-art in indexing data structures and search algorithms for practical variants of Approximate Nearest Neighbor (AN... |
| VectorSearch: Enhancing Document Retrieval with Semantic Embeddings and Optimized Search | Solmaz Seyed Monir, Irene Lau, Shubing Yang, Dongfang Zhao | 2024-09-25 | Traditional retrieval methods have been essential for assessing document similarity but struggle with capturing semantic nuances. Despite advancements in latent semantic analysis (LSA) and deep learni... |
| EfiMon: A Process Analyser for Granular Power Consumption Prediction | Luis G. León-Vega, Niccolò Tosato, Stefano Cozzini | 2024-09-25 | High-performance computing (HPC) and supercomputing are critical in Artificial Intelligence (AI) research, development, and deployment. The extensive use of supercomputers for training complex AI mode... |
