2024-09-30

cs.AR - Architecture

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| Accelerating PoT Quantization on Edge Devices | Rappy Saha, Jude Haris, José Cano | 2024-09-30 | Non-uniform quantization, such as power-of-two (PoT) quantization, matches data distributions better than uniform quantization, which reduces the quantization error of Deep Neural Networks (DNNs). |
| SAMIPS: A Synthesised Asynchronous Processor | Qianyi Zhang, Georgios Theodoropoulos | 2024-09-30 | Miniaturisation and ever increasing clock speeds pose significant challenges to synchronous VLSI design with clock distribution becoming an increasingly costly and complicated issue and power consumpt... |
| FastFlow in FPGA Stacks of Data Centers | Rourab Paul, Alberto Ottimo, Marco Danelutto | 2024-09-30 | FPGA programming is more complex as compared to Central Processing Units (CPUs) and Graphics Processing Units (GPUs). The coding languages to define the abstraction of Register Transfer Level (RTL) in... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| Comprehensive Performance Modeling and System Design Insights for Foundation Models | Shashank Subramanian, Ermal Rrapaj, Peter Harrington, Smeet Chheda, Steven Farrell, Brian Austin, Samuel Williams, Nicholas Wright, Wahid Bhimji | 2024-09-30 | Generative AI, in particular large transformer models, are increasingly driving HPC system design in science and industry. We analyze performance characteristics of such transformer models and discuss... |
| Fisher Information-based Efficient Curriculum Federated Learning with Large Language Models | Ji Liu, Jiaxiang Ren, Ruoming Jin, Zijie Zhang, Yang Zhou, Patrick Valduriez, Dejing Dou | 2024-09-30 | As a promising paradigm to collaboratively train models with decentralized data, Federated Learning (FL) can be exploited to fine-tune Large Language Models (LLMs). |
| Intel(R) SHMEM: GPU-initiated OpenSHMEM using SYCL | Alex Brooks, Philip Marshall, David Ozog, Md. Wasi-ur-Rahman, Lawrence Stewart, Rithwik Tom | 2024-09-30 | Modern high-end systems are increasingly becoming heterogeneous, providing users options to use general purpose Graphics Processing Units (GPU) and other accelerators for additional performance. |
| Resource Allocation for Stable LLM Training in Mobile Edge Computing | Chang Liu, Jun Zhao | 2024-09-30 | As mobile devices increasingly become focal points for advanced applications, edge computing presents a viable solution to their inherent computational limitations, particularly in deploying large lan... |
| Pragma driven shared memory parallelism in Zig by supporting OpenMP loop directives | David Kacs, Joseph Lee, Justs Zarins, Nick Brown | 2024-09-30 | The Zig programming language, which is designed to provide performance and safety as first class concerns, has become popular in recent years. |
| Optimizing Cross-Client Domain Coverage for Federated Instruction Tuning of Large Language Models | Zezhou Wang, Yaxin Du, Xingjun Ma, Yugang Jiang, Zhuzhong Qian, Siheng Chen | 2024-09-30 | Federated domain-specific instruction tuning (FedDIT) for large language models (LLMs) aims to enhance performance in specialized domains using distributed private and limited data, yet identifying ke... |
| DBNode: A Decentralized Storage System for Big Data Storage in Consortium Blockchains | Narges Dadkhah, Xuyang Ma, Katinka Wolter, Gerhard Wunder | 2024-09-30 | Storing big data directly on a blockchain poses a substantial burden due to the need to maintain a consistent ledger across all nodes. Numerous studies in decentralized storage systems have been condu... |
| Experimental Validation of User Experience-focused Dynamic Onboard Service Orchestration for Software Defined Vehicles | Pierre Laclau, Stéphane Bonnet, Bertrand Ducourthial, Trista Lin, Xiaoting Li | 2024-09-30 | In response to the growing need for dynamic software features in automobiles, Software Defined Vehicles (SDVs) have emerged as a promising solution. |
| Edge Intelligence in Satellite-Terrestrial Networks with Hybrid Quantum Computing | Siyue Huang, Lifeng Wang, Xin Wang, Bo Tan, Wei Ni, Kai-Kit Wong | 2024-09-30 | This paper exploits the potential of edge intelligence empowered satellite-terrestrial networks, where users' computation tasks are offloaded to the satellites or terrestrial base stations. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| Meta Reinforcement Learning Approach for Adaptive Resource Optimization in O-RAN | Fatemeh Lotfi, Fatemeh Afghah | 2024-09-30 | As wireless networks grow to support more complex applications, the Open Radio Access Network (O-RAN) architecture, with its smart RAN Intelligent Controller (RIC) modules, becomes a crucial solution ... |
| What If We Had Used a Different App? Reliable Counterfactual KPI Analysis in Wireless Systems | Qiushuo Hou, Sangwoo Park, Matteo Zecchin, Yunlong Cai, Guanding Yu, Osvaldo Simeone | 2024-09-30 | In modern wireless network architectures, such as Open Radio Access Network (O-RAN), the operation of the radio access network (RAN) is managed by applications, or apps for short, deployed at intellig... |
| Multi-Scale Convolutional LSTM with Transfer Learning for Anomaly Detection in Cellular Networks | Nooruddin Noonari, Daniel Corujo, Rui L. Aguiar, Francisco J. Ferrao | 2024-09-30 | The rapid growth in mobile broadband usage and increasing subscribers have made it crucial to ensure reliable network performance. As mobile networks grow more complex, especially during peak hours, m... |
| Packet Aggregation May Harm Batched Network Coding | Hoover H. F. Yin | 2024-09-30 | Batched network coding (BNC) is a solution to multi-hop transmission on networks with packet loss. To be compatible with the existing infrastructure, BNC is usually implemented over UDP. |
| Age of Gossip with the Push-Pull Protocol | Arunabh Srivastava, Thomas Jacob Maranzatto, Sennur Ulukus | 2024-09-30 | We consider a wireless network where a source generates packets and forwards them to a network containing n nodes. The nodes in the network use the asynchronous push, pull or push-pull gossip commun... |
| Beacon based uplink transmission for lorawan direct to satellite internet of things | Mohammad Al Mojamed | 2024-09-30 | Direct-to-satellite IoT (DtS IoT) communication structure is a promising solution to provide connectivity and extend the coverage of traditional low-power and long-range technologies, especially for iso... |
| Machine Learning-enabled Traffic Steering in O-RAN: A Case Study on Hierarchical Learning Approach | Md Arafat Habib, Hao Zhou, Pedro Enrique Iturria-Rivera, Yigit Ozcan, Medhat Elsayed, Majid Bavand, Raimundas Gaigalas, Melike Erol-Kantarci | 2024-09-30 | Traffic Steering is a crucial technology for wireless networks, and multiple efforts have been put into developing efficient Machine Learning (ML)-enabled traffic steering schemes for Open Radio Acces... |
| Diagnosing and Repairing Distributed Routing Configurations Using Selective Symbolic Simulation | Rulan Yang, Gao Han, Hanyang Shao, Xiaoqiang Zheng, Xing Fang, Ziyi Wang, Lizhao You, Ruiting Zhou, Linghe Kong, Ennan Zhai, Qiao Xiang, Jiwu Shu | 2024-09-30 | Although substantial progress has been made in automatically verifying whether distributed routing configurations conform to certain requirements, diagnosing and repairing configuration errors remains... |
| Exploring QUIC Dynamics: A Large-Scale Dataset for Encrypted Traffic Analysis | Barak Gahtan, Robert J. Shahla, Alex M. Bronstein, Reuven Cohen | 2024-09-30 | The increasing adoption of the QUIC transport protocol has transformed encrypted web traffic, necessitating new methodologies for network analysis. |
| Experimental Validation of User Experience-focused Dynamic Onboard Service Orchestration for Software Defined Vehicles | Pierre Laclau, Stéphane Bonnet, Bertrand Ducourthial, Trista Lin, Xiaoting Li | 2024-09-30 | In response to the growing need for dynamic software features in automobiles, Software Defined Vehicles (SDVs) have emerged as a promising solution. |
| Offline Meta-learning for Real-time Bandwidth Estimation | Aashish Gottipati, Sami Khairy, Yasaman Hosseinkashi, Gabriel Mittag, Vishak Gopal, Francis Y. Yan, Ross Cutler | 2024-09-30 | Real-time video applications require dynamic bitrate adjustments based on network capacity, necessitating accurate bandwidth estimation (BWE). |

cs.PF - Performance

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| Streaming Data in HPC Workflows Using ADIOS | Greg Eisenhauer, Norbert Podhorszki, Ana Gainaru, Scott Klasky, Philip E. Davis, Manish Parashar, Matthew Wolf, Eric Suchtya, Erick Fredj, Vicente Bolea, Franz Pöschel, Klaus Steiniger, Michael Bussmann, Richard Pausch, Sunita Chandrasekaran | 2024-09-30 | The "IO Wall" problem, in which the gap between computation rate and data access rate grows continuously, poses significant problems to scientific workflows which have traditionally relied upon using ... |
