2024-05-29

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Optimizing Foundation Model Inference on a Many-tiny-core Open-source RISC-V Platform | Viviane Potocnik, Luca Colagrande, Tim Fischer, Luca Bertaccini, Daniele Jahier Pagliari, Alessio Burrello, Luca Benini | 2024-05-29 | Download | Transformer-based foundation models have become crucial for various domains, most notably natural language processing (NLP) or computer vision (CV). |
| xTern: Energy-Efficient Ternary Neural Network Inference on RISC-V-Based Edge Systems | Georg Rutishauser, Joan Mihali, Moritz Scherer, Luca Benini | 2024-05-29 | Download | Ternary neural networks (TNNs) offer a superior accuracy-energy trade-off compared to binary neural networks. However, until now, they have required specialized accelerators to realize their efficienc... |
| An Open-Source Framework for Efficient Numerically-Tailored Computations | Louis Ledoux, Marc Casas | 2024-05-29 | Download | We present a versatile open-source framework designed to facilitate efficient, numerically-tailored Matrix-Matrix Multiplications (MMMs). The framework offers two primary contributions: first, a fine-... |
| MoNDE: Mixture of Near-Data Experts for Large-Scale Sparse Models | Taehyun Kim, Kwanseok Choi, Youngmock Cho, Jaehoon Cho, Hyuk-Jae Lee, Jaewoong Sim | 2024-05-29 | Download | Mixture-of-Experts (MoE) large language models (LLM) have memory requirements that often exceed the GPU memory capacity, requiring costly parameter movement from secondary memories to the GPU for expe... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Conveyor: Efficient Tool-aware LLM Serving with Tool Partial Execution | Yechen Xu, Xinhao Kong, Tingjun Chen, Danyang Zhuo | 2024-05-29 | Download | The complexity of large language model (LLM) serving workloads has substantially increased due to the integration with external tool invocations, such as ChatGPT plugins. |
| Decentralized Optimization in Time-Varying Networks with Arbitrary Delays | Tomas Ortega, Hamid Jafarkhani | 2024-05-29 | Download | We consider a decentralized optimization problem for networks affected by communication delays. Examples of such networks include collaborative machine learning, sensor networks, and multi-agent syste... |
| Construction of a Byzantine Linearizable SWMR Atomic Register from SWSR Atomic Registers | Ajay D. Kshemkalyani, Manaswini Piduguralla, Sathya Peri, Anshuman Misra | 2024-05-29 | Download | The SWMR atomic register is a fundamental building block in shared memory distributed systems and implementing it from SWSR atomic registers is an important problem. |
| Optimizing Foundation Model Inference on a Many-tiny-core Open-source RISC-V Platform | Viviane Potocnik, Luca Colagrande, Tim Fischer, Luca Bertaccini, Daniele Jahier Pagliari, Alessio Burrello, Luca Benini | 2024-05-29 | Download | Transformer-based foundation models have become crucial for various domains, most notably natural language processing (NLP) or computer vision (CV). |
| Differentially Private Clustered Federated Learning | Saber Malekmohammadi, Afaf Taik, Golnoosh Farnadi | 2024-05-29 | Download | Federated learning (FL), which is a decentralized machine learning (ML) approach, often incorporates differential privacy (DP) to provide rigorous data privacy guarantees. |
| Hybrid-Parallel: Achieving High Performance and Energy Efficient Distributed Inference on Robots | Zekai Sun, Xiuxian Guan, Junming Wang, Haoze Song, Yuhao Qing, Tianxiang Shen, Dong Huang, Fangming Liu, Heming Cui | 2024-05-29 | Download | The rapid advancements in machine learning techniques have led to significant achievements in various real-world robotic tasks. These tasks heavily rely on fast and energy-efficient inference of deep ... |
| Accelerating Lattice QCD Simulations using GPUs | Tilmann Matthaei | 2024-05-29 | Download | Solving discretized versions of the Dirac equation represents a large share of execution time in lattice Quantum Chromodynamics (QCD) simulations. |
| LoByITFL: Low Communication Secure and Private Federated Learning | Yue Xia, Maximilian Egger, Christoph Hofmeister, Rawad Bitar | 2024-05-29 | Download | Privacy of the clients' data and security against Byzantine clients are key challenges in Federated Learning (FL). Existing solutions to joint privacy and security incur sacrifices on the privacy guar... |
| Multi-Source Coflow Scheduling in Collaborative Edge Computing with Multihop Network | Yuvraj Sahni, Jiannong Cao, Lei Yang, Shengwei Wang | 2024-05-29 | Download | Collaborative edge computing has become a popular paradigm where edge devices collaborate by sharing resources. Data dissemination is a fundamental problem in CEC to decide what data is transmitted fr... |
| Learning Interpretable Scheduling Algorithms for Data Processing Clusters | Zhibo Hu, Chen Wang, Helen, Paik, Yanfeng Shu, Liming Zhu | 2024-05-29 | Download | Workloads in data processing clusters are often represented in the form of DAG (Directed Acyclic Graph) jobs. Scheduling DAG jobs is challenging. |
| Federated Assemblies | Daniel Halpern, Ariel D. Procaccia, Ehud Shapiro, Nimrod Talmon | 2024-05-29 | Download | A citizens' assembly is a group of people who are randomly selected to represent a larger population in a deliberation. While this approach has successfully strengthened democracy, it has certain limi... |
| Optimization-based Proof of Useful Work: Framework, Modeling, and Security Analysis | Weihang Cao, Xintong Ling, Jiaheng Wang, Xiqi Gao, Zhi Ding | 2024-05-29 | Download | Proof of Work (PoW) has extensively served as the foundation of blockchain's security, consistency, and tamper-resistance. However, long has it been criticized for its tremendous and inefficient utili... |
| Federated Learning under Partially Class-Disjoint Data via Manifold Reshaping | Ziqing Fan, Jiangchao Yao, Ruipeng Zhang, Lingjuan Lyu, Ya Zhang, Yanfeng Wang | 2024-05-29 | Download | Statistical heterogeneity severely limits the performance of federated learning (FL), motivating several explorations e.g., FedProx, MOON and FedDyn, to alleviate this problem. |
| Federated Learning with Bilateral Curation for Partially Class-Disjoint Data | Ziqing Fan, Ruipeng Zhang, Jiangchao Yao, Bo Han, Ya Zhang, Yanfeng Wang | 2024-05-29 | Download | Partially class-disjoint data (PCDD), a common yet under-explored data formation where each client contributes a part of classes (instead of all classes) of samples, severely challenges the performanc... |
| Locally Estimated Global Perturbations are Better than Local Perturbations for Federated Sharpness-aware Minimization | Ziqing Fan, Shengchao Hu, Jiangchao Yao, Gang Niu, Ya Zhang, Masashi Sugiyama, Yanfeng Wang | 2024-05-29 | Download | In federated learning (FL), the multi-step update and data heterogeneity among clients often lead to a loss landscape with sharper minima, degenerating the performance of the resulted global model. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| RANFusion: A Comprehensive Tool for Simulating Handover In Next-G RAN | Seyed Bagher Hashemi Natanzi, Bo Tang | 2024-05-29 | Download | The rapid advancement of 5G networks and the upcoming transition to 6G necessitate the use of the Open Radio Access Network (O-RAN) architecture to enable greater flexibility, interoperability, and in... |
| Network Connectivity--Information Freshness Tradeoff in Information Dissemination Over Networks | Arunabh Srivastava, Sennur Ulukus | 2024-05-29 | Download | We consider a gossip network consisting of a source generating updates and n nodes connected according to a given graph structure. The source keeps updates of a process, that might be generated or o... |
| EdgeSight: Enabling Modeless and Cost-Efficient Inference at the Edge | ChonLam Lao, Jiaqi Gao, Ganesh Ananthanarayanan, Aditya Akella, Minlan Yu | 2024-05-29 | Download | Traditional ML inference is evolving toward modeless inference, which abstracts the complexity of model selection from users, allowing the system to automatically choose the most appropriate model for... |
| Multi-Source Coflow Scheduling in Collaborative Edge Computing with Multihop Network | Yuvraj Sahni, Jiannong Cao, Lei Yang, Shengwei Wang | 2024-05-29 | Download | Collaborative edge computing has become a popular paradigm where edge devices collaborate by sharing resources. Data dissemination is a fundamental problem in CEC to decide what data is transmitted fr... |
| Preamble Design and Burst-Mode DSP for Upstream Reception of 200G Coherent TDM-PON | Haide Wang, Ji Zhou, Jinyang Yang, Zhiyang Liu, Cheng Li, Weiping Liu, Changyuan Yu | 2024-05-29 | Download | Burst-mode DSP based on 10ns preamble is proposed for upstream reception of 200G coherent TDM-PON. The 128-symbol tone preamble is used for SOP, frequency offset, and sampling phase estimation, while ... |
| Quantum Circuit Switching with One-Way Repeaters in Star Networks | Álvaro G. Iñesta, Hyeongrak Choi, Dirk Englund, Stephanie Wehner | 2024-05-29 | Download | Distributing quantum states reliably among distant locations is a key challenge in the field of quantum networks. One-way quantum networks address this by using one-way communication and quantum error... |
| To RL or not to RL? An Algorithmic Cheat-Sheet for AI-Based Radio Resource Management | Lorenzo Maggi, Matthew Andrews, Ryo Koblitz | 2024-05-29 | Download | Several Radio Resource Management (RRM) use cases can be framed as sequential decision planning problems, where an agent (the base station, typically) makes decisions that influence the network utilit... |
| Optimizing Vehicular Networks with Variational Quantum Circuits-based Reinforcement Learning | Zijiang Yan, Ramsundar Tanikella, Hina Tabassum | 2024-05-29 | Download | In vehicular networks (VNets), ensuring both road safety and dependable network connectivity is of utmost importance. Achieving this necessitates the creation of resilient and efficient decision-makin... |
| User Association and Channel Allocation in 5G Mobile Asymmetric Multi-band Heterogeneous Networks | Miao Dai, Gang Sun, Hongfang Yu, Sheng Wang, Dusit Niyato | 2024-05-29 | Download | With the proliferation of mobile terminals and the continuous upgrading of services, 4G LTE networks are showing signs of weakness. To enhance the capacity of wireless networks, millimeter waves are i... |
| FlocOff: Data Heterogeneity Resilient Federated Learning with Communication-Efficient Edge Offloading | Mulei Ma, Chenyu Gong, Liekang Zeng, Yang Yang, Liantao Wu | 2024-05-29 | Download | Federated Learning (FL) has emerged as a fundamental learning paradigm to harness massive data scattered at geo-distributed edge devices in a privacy-preserving way. |
| Adaptive and Parallel Split Federated Learning in Vehicular Edge Computing | Xianke Qiang, Zheng Chang, Yun Hu, Lei Liu, Timo Hamalainen | 2024-05-29 | Download | Vehicular edge intelligence (VEI) is a promising paradigm for enabling future intelligent transportation systems by accommodating artificial intelligence (AI) at the vehicular edge computing (VEC) sys... |
