2025-03-07
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| TPU-Gen: LLM-Driven Custom Tensor Processing Unit Generator | Deepak Vungarala, Mohammed E. Elbtity, Sumiya Syed, Sakila Alam, Kartik Pandit, Arnob Ghosh, Ramtin Zand, Shaahin Angizi | 2025-03-07 | Download | The increasing complexity and scale of Deep Neural Networks (DNNs) necessitate specialized tensor accelerators, such as Tensor Processing Units (TPUs), to meet various computational and energy efficie... |
| SDT: Cutting Datacenter Tax Through Simultaneous Data-Delivery Threads | Amin Mamandipoor, Huy Dinh Tran, Mohammad Alian | 2025-03-07 | Download | Networking is considered a datacenter tax, and hyperscalers push hard to provide high-performance networking with minimal resource expenditure. |
| MatrixFlow: System-Accelerator co-design for high-performance transformer applications | Qunyou Liu, Marina Zapater, David Atienza | 2025-03-07 | Download | Transformers are central to advances in artificial intelligence (AI), excelling in fields ranging from computer vision to natural language processing. |
| Real-Time Semantic Segmentation of Aerial Images Using an Embedded U-Net: A Comparison of CPU, GPU, and FPGA Workflows | Julien Posso, Hugo Kieffer, Nicolas Menga, Omar Hlimi, Sébastien Tarris, Hubert Guerard, Guy Bois, Matthieu Couderc, Eric Jenn | 2025-03-07 | Download | This study introduces a lightweight U-Net model optimized for real-time semantic segmentation of aerial images, targeting the efficient utilization of Commercial Off-The-Shelf (COTS) embedded computin... |
| StreamGrid: Streaming Point Cloud Analytics via Compulsory Splitting and Deterministic Termination | Yu Feng, Zheng Liu, Weikai Lin, Zihan Liu, Jingwen Leng, Minyi Guo, Zhezhi He, Jieru Zhao, Yuhao Zhu | 2025-03-07 | Download | Point clouds are increasingly important in intelligent applications, but frequent off-chip memory traffic in accelerators causes pipeline stalls and leads to high energy consumption. |
| Piccolo: Large-Scale Graph Processing with Fine-Grained In-Memory Scatter-Gather | Changmin Shin, Jaeyong Song, Hongsun Jang, Dogeun Kim, Jun Sung, Taehee Kwon, Jae Hyung Ju, Frank Liu, Yeonkyu Choi, Jinho Lee | 2025-03-07 | Download | Graph processing requires irregular, fine-grained random access patterns incompatible with contemporary off-chip memory architecture, leading to inefficient data access. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| VersaSlot: Efficient Fine-grained FPGA Sharing with Big.Little Slots and Live Migration in FPGA Cluster | Jianfeng Gu, Hao Wang, Xiaorang Guo, Martin Schulz, Michael Gerndt | 2025-03-07 | Download | As FPGAs gain popularity for on-demand application acceleration in data center computing, dynamic partial reconfiguration (DPR) has become an effective fine-grained sharing technique for FPGA multiple... |
| Practical Federated Learning without a Server | Akash Dhasade, Anne-Marie Kermarrec, Erick Lavoie, Johan Pouwelse, Rishi Sharma, Martijn de Vos | 2025-03-07 | Download | Federated Learning (FL) enables end-user devices to collaboratively train ML models without sharing raw data, thereby preserving data privacy. |
| Umbilical Choir: Automated Live Testing for Edge-To-Cloud FaaS Applications | Mohammadreza Malekabbasi, Tobias Pfandzelter, David Bermbach | 2025-03-07 | Download | Application users react negatively to performance regressions or availability issues across software releases. To address this, modern cloud-based applications with their multiple daily releases rely ... |
| A Decentralized Sequencer and Data Availability Committee for Rollups Using Set Consensus | Margarita Capretto, Martín Ceresa, Antonio Fernández Anta, Pedro Moreno-Sánchez, César Sánchez | 2025-03-07 | Download | Blockchains face a scalability challenge due to the intrinsic throughput limitations of consensus protocols and the limitation in block sizes due to decentralization. |
| Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts | Weigao Sun, Disen Lan, Tong Zhu, Xiaoye Qu, Yu Cheng | 2025-03-07 | Download | Linear Sequence Modeling (LSM) like linear attention, state space models and linear RNNs, and Mixture-of-Experts (MoE) have recently emerged as significant architectural improvements. |
| Efficient Parallel Scheduling for Sparse Triangular Solvers | Toni Böhnlein, Pál András Papp, Raphael S. Steiner, Christos K. Matzoros, A. N. Yzelman | 2025-03-07 | Download | We develop and analyze new scheduling algorithms for solving sparse triangular linear systems (SpTRSV) in parallel. Our approach produces highly efficient synchronous schedules for the forward- and ba... |
| Optimizing LLM Inference Throughput via Memory-aware and SLA-constrained Dynamic Batching | Bowen Pang, Kai Li, Feifan Wang | 2025-03-07 | Download | The increasing adoption of large language models (LLMs) necessitates inference serving systems that can deliver both high throughput and low latency. |
| Uncertainty-Aware Explainable Federated Learning | Yanci Zhang, Han Yu | 2025-03-07 | Download | Federated Learning (FL) is a collaborative machine learning paradigm for enhancing data privacy preservation. Its privacy-preserving nature complicates the explanation of the decision-making processes... |
| Dilu: Enabling GPU Resourcing-on-Demand for Serverless DL Serving via Introspective Elasticity | Cunchi Lv, Xiao Shi, Zhengyu Lei, Jinyue Huang, Wenting Tan, Xiaohui Zheng, Xiaofang Zhao | 2025-03-07 | Download | Serverless computing, with its ease of management, auto-scaling, and cost-effectiveness, is widely adopted by deep learning (DL) applications. |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| SDT: Cutting Datacenter Tax Through Simultaneous Data-Delivery Threads | Amin Mamandipoor, Huy Dinh Tran, Mohammad Alian | 2025-03-07 | Download | Networking is considered a datacenter tax, and hyperscalers push hard to provide high-performance networking with minimal resource expenditure. |
| REACT: Multi Robot Energy-Aware Orchestrator for Indoor Search and Rescue Critical Tasks | Fabio Maresca, Arnau Romero, Carmen Delgado, Vincenzo Sciancalepore, Josep Paradells, Xavier Costa-Pérez | 2025-03-07 | Download | Smart factories enhance production efficiency and sustainability, but emergencies like human errors, machinery failures and natural disasters pose significant risks. |
| RiLoCo: An ISAC-oriented AI Solution to Build RIS-empowered Networks | Guillermo Encinas-Lago, Vincenzo Sciancalepore, Henk Wymeersch, Marco Di Renzo, Xavier Costa-Perez | 2025-03-07 | Download | The advance towards 6G networks comes with the promise of unprecedented performance in sensing and communication capabilities. The feat of achieving those, while satisfying the ever-growing demands pl... |
| Wi-Fi 6 Cross-Technology Interference Detection and Mitigation by OFDMA: an Experimental Study | Thijs Havinga, Xianjun Jiao, Wei Liu, Baiheng Chen, Adnan Shahid, Ingrid Moerman | 2025-03-07 | Download | Cross-Technology Interference (CTI) poses challenges for the performance and robustness of wireless networks. There are opportunities for better cooperation if the spectral occupation and technology o... |
| Routing for Large ML Models | Ofir Cohen, Jose Yallouz, Michael Schapira, Shahar Belkar, Tal Mizrahi | 2025-03-07 | Download | Training large language models (LLMs), and other large machine learning models, involves repeated communication of large volumes of data across a data center network. |
| Evaluation of 3D Terrestrial and Aerial Spectrum Sharing with Massive MIMO Systems | Achiel Colpaert, Zhuangzhuang Cui, Sofie Pollin | 2025-03-07 | Download | Connecting aerial and terrestrial users with a single base station (BS) is increasingly challenging due to the rising number of aerial users like unmanned aerial vehicles (UAVs). |
| ORANSight-2.0: Foundational LLMs for O-RAN | Pranshav Gajjar, Vijay K. Shah | 2025-03-07 | Download | Despite the transformative impact of Large Language Models (LLMs) across critical domains such as healthcare, customer service, and business marketing, their integration into Open Radio Access Network... |
| Cross-Layer-Optimized Link Selection for Hologram Video Streaming over Millimeter Wave Networks | Yiming Jiang, Yanwei Liu, Jinxia Liu, Antonios Argyriou, Yifei Chen, Wen Zhang | 2025-03-07 | Download | Holographic-type communication brings an immersive tele-holography experience by delivering holographic contents to users. As the direct representation of holographic contents, hologram videos are nat... |
| Small noise limits of Markov chains and the PageRank | Vivek S Borkar, S Sowmya, Raghavendra Tripathi | 2025-03-07 | Download | We recall the classical formulation of PageRank as the stationary distribution of a singularly perturbed irreducible Markov chain that is not irreducible when the perturbation parameter goes to zero. |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| HyperGraph ROS: An Open-Source Robot Operating System for Hybrid Parallel Computing based on Computational HyperGraph | Shufang Zhang, Jiazheng Wu, Jiacheng He, Kaiyi Wang, Shan An | 2025-03-07 | Download | This paper presents HyperGraph ROS, an open-source robot operating system that unifies intra-process, inter-process, and cross-device computation into a computational hypergraph for efficient message ... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Leveraging Approximate Caching for Faster Retrieval-Augmented Generation | Shai Bergman, Anne-Marie Kermarrec, Diana Petrescu, Rafael Pires, Mathis Randl, Martijn de Vos, Ji Zhang | 2025-03-07 | Download | Retrieval-augmented generation (RAG) improves the reliability of large language model (LLM) answers by integrating external knowledge. However, RAG increases the end-to-end inference time since lookin... |