2025-02-02
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Huff-LLM: End-to-End Lossless Compression for Efficient LLM Inference | Patrick Yubeaton, Tareq Mahmoud, Shehab Naga, Pooria Taheri, Tianhua Xia, Arun George, Yasmein Khalil, Sai Qian Zhang, Siddharth Joshi, Chinmay Hegde, Siddharth Garg | 2025-02-02 | Download | As they become more capable, large language models (LLMs) have continued to rapidly increase in size. This has exacerbated the difficulty in running state of the art LLMs on small, edge devices. |
| A Flexible Precision Scaling Deep Neural Network Accelerator with Efficient Weight Combination | Liang Zhao, Kunming Shao, Fengshi Tian, Tim Kwang-Ting Cheng, Chi-Ying Tsui, Yi Zou | 2025-02-02 | Download | Deploying mixed-precision neural networks on edge devices is friendly to hardware resources and power consumption. To support fully mixed-precision neural network inference, it is necessary to design ... |
| DeepGate4: Efficient and Effective Representation Learning for Circuit Design at Scale | Ziyang Zheng, Shan Huang, Jianyuan Zhong, Zhengyuan Shi, Guohao Dai, Ningyi Xu, Qiang Xu | 2025-02-02 | Download | Circuit representation learning has become pivotal in electronic design automation, enabling critical tasks such as testability analysis, logic reasoning, power estimation, and SAT solving. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| ModServe: Modality- and Stage-Aware Resource Disaggregation for Scalable Multimodal Model Serving | Haoran Qiu, Anish Biswas, Zihan Zhao, Jayashree Mohan, Alind Khare, Esha Choukse, Íñigo Goiri, Zeyu Zhang, Haiying Shen, Chetan Bansal, Ramachandran Ramjee, Rodrigo Fonseca | 2025-02-02 | Download | Large multimodal models (LMMs) demonstrate impressive capabilities in understanding images, videos, and audio beyond text. However, efficiently serving LMMs in production environments poses significan... |
| FedRIR: Rethinking Information Representation in Federated Learning | Yongqiang Huang, Zerui Shao, Ziyuan Yang, Zexin Lu, Yi Zhang | 2025-02-02 | Download | Mobile and Web-of-Things (WoT) devices at the network edge generate vast amounts of data for machine learning applications, yet privacy concerns hinder centralized model training. |
| ATA: Adaptive Task Allocation for Efficient Resource Management in Distributed Machine Learning | Artavazd Maranjyan, El Mehdi Saad, Peter Richtárik, Francesco Orabona | 2025-02-02 | Download | Asynchronous methods are fundamental for parallelizing computations in distributed machine learning. They aim to accelerate training by fully utilizing all available resources. |
| Demystifying Cost-Efficiency in LLM Serving over Heterogeneous GPUs | Youhe Jiang, Fangcheng Fu, Xiaozhe Yao, Guoliang He, Xupeng Miao, Ana Klimovic, Bin Cui, Binhang Yuan, Eiko Yoneki | 2025-02-02 | Download | Recent advancements in Large Language Models (LLMs) have led to increasingly diverse requests, accompanied with varying resource (compute and memory) demands to serve them. |
| DeLIAP e DeLIAJ: Interfaces de biblioteca de Dependabilidade para Python e Julia | Marcos Irigoyen, Carla Santana, Ramon C. F Araújo, Samuel Xavier-de-Souza | 2025-02-02 | Download | The ever-growing computational complexity of High Performance Computing applications is often met with a horizontal scaling of computing systems. |
| Optimal local certification on graphs of bounded pathwidth | Dan Alden Baterisna, Yi-Jun Chang | 2025-02-02 | Download | We present proof labeling schemes for graphs with bounded pathwidth that can decide any graph property expressible in monadic second-order (MSO) logic using -bit vertex labels. |
| POSMAC: Powering Up In-Network AR/CG Traffic Classification with Online Learning | Alireza Shirmarz, Fabio Luciano Verdi, Suneet Kumar Singh, Christian Esteve Rothenberg | 2025-02-02 | Download | In this demonstration, we showcase POSMAC, a platform designed to deploy Decision Tree (DT) and Random Forest (RF) models on the NVIDIA DOCA DPU, equipped with an ARM processor, for real-time network... |
| General Coded Computing in a Probabilistic Straggler Regime | Parsa Moradi, Mohammad Ali Maddah-Ali | 2025-02-02 | Download | Coded computing has demonstrated promising results in addressing straggler resiliency in distributed computing systems. However, most coded computing schemes are designed for exact computation, requir... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Improving SDN Performance Using Network Coding: A Quantitative Analysis | Amer T. Ali, Qutaiba I. Ali | 2025-02-02 | Download | Software Defined Networking or SDN is an architectural approach to managing the network where the control and forwarding are different planes that are controlled through an application interface. |
| Mathematical Modeling for Network Upgrades in Internet Service Provider Infrastructure | Omar M Malallah, Qutaiba I. Ali | 2025-02-02 | Download | The ongoing growth of the need for superior Internet services creates great pressure on the ISPs as to the accurate estimation of network upgrade need. |
| Forecasting Global Network Traffic Trends: The Role of Virtual Reality | Raghad H. AlShekh, Qutaiba I. Ali | 2025-02-02 | Download | Virtual Reality (VR) technology demands real-time data transmission to deliver an immersive and interactive user experience. This study investigates the implementation of UDP Ethernet communication in... |
| REAL: Reinforcement Learning-Enabled xApps for Experimental Closed-Loop Optimization in O-RAN with OSC RIC and srsRAN | Ryan Barker, Alireza Ebrahimi Dorcheh, Tolunay Seyfi, Fatemeh Afghah | 2025-02-02 | Download | Open Radio Access Network (O-RAN) offers an open, programmable architecture for next-generation wireless networks, enabling advanced control through AI-based applications on the near-Real-Time RAN Int... |
| CardioLive: Empowering Video Streaming with Online Cardiac Monitoring | Sheng Lyu, Ruiming Huang, Sijie Ji, Yasar Abbas Ur Rehman, Lan Ma, Chenshu Wu | 2025-02-02 | Download | Online Cardiac Monitoring (OCM) emerges as a compelling enhancement for the next-generation video streaming platforms. It enables various applications including remote health, online affective computi... |
| POSMAC: Powering Up In-Network AR/CG Traffic Classification with Online Learning | Alireza Shirmarz, Fabio Luciano Verdi, Suneet Kumar Singh, Christian Esteve Rothenberg | 2025-02-02 | Download | In this demonstration, we showcase POSMAC, a platform designed to deploy Decision Tree (DT) and Random Forest (RF) models on the NVIDIA DOCA DPU, equipped with an ARM processor, for real-time network... |
| Congestion Management in High-Performance Interconnection Networks Using Adaptive Routing Notifications | Jose Rocher-Gonzalez, Jesus Escudero-Sahuquillo, Pedro J. Garcia, Francisco J. Quiles | 2025-02-02 | Download | The interconnection network is a crucial subsystem in High-Performance Computing clusters and Data-centers, guaranteeing high bandwidth and low latency to the applications' communication operations. |
| Using Causality for Enhanced Prediction of Web Traffic Time Series | Chang Tian, Mingzhe Xing, Zenglin Shi, Matthew B. Blaschko, Yinliang Yue, Marie-Francine Moens | 2025-02-02 | Download | Predicting web service traffic has significant social value, as it can be applied to various practical scenarios, including but not limited to dynamic resource scaling, load balancing, system anomaly ... |
| Hades: Hierarchical Adaptable Decoding for Efficient and Elastic vRAN | Jincao Zhu, Kobus Van Der Merwe, Xenofon Foukas, Bozidar Radunovic | 2025-02-02 | Download | In cellular networks, virtualized Radio Access Networks (vRANs) enable replacing traditional specialized hardware at cell sites with software running on commodity servers distributed across edge and r... |