2025-02-26
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Vision Transformers on the Edge: A Comprehensive Survey of Model Compression and Acceleration Strategies | Shaibal Saha, Lanyu Xu | 2025-02-26 | Download | In recent years, vision transformers (ViTs) have emerged as powerful and promising techniques for computer vision tasks such as image classification, object detection, and segmentation. |
| FPGA-based Emulation and Device-Side Management for CXL-based Memory Tiering Systems | Yiqi Chen, Xiping Dong, Zhe Zhou, Zhao Wang, Jie Zhang, Guangyu Sun | 2025-02-26 | Download | The Compute Express Link (CXL) technology facilitates the extension of CPU memory through byte-addressable SerDes links and cascaded switches, creating complex heterogeneous memory systems where CPU a... |
| A Multicast-Capable AXI Crossbar for Many-core Machine Learning Accelerators | Luca Colagrande, Luca Benini | 2025-02-26 | Download | To keep up with the growing computational requirements of machine learning workloads, many-core accelerators integrate an ever-increasing number of processing elements, putting the efficiency of memor... |
| Evaluation of CGRA Toolchains | Dominik Walter, Marita Halm, Daniel Seidel, Indrayudh Ghosh, Christian Heidorn, Frank Hannig, Jürgen Teich | 2025-02-26 | Download | Increasing demands for computing power also propel the need for energy-efficient SoC accelerator architectures. One class for such accelerators are so-called processor arrays, which typically integrat... |
| 3D-TrIM: A Memory-Efficient Spatial Computing Architecture for Convolution Workloads | Cristian Sestito, Ahmed J. Abdelmaksoud, Shady Agwa, Themis Prodromakis | 2025-02-26 | Download | The Von Neumann bottleneck, which relates to the energy cost of moving data from memory to on-chip core and vice versa, is a serious challenge in state-of-the-art AI architectures, like Convolutional ... |
| A Reliable, Time-Predictable Heterogeneous SoC for AI-Enhanced Mixed-Criticality Edge Applications | Angelo Garofalo, Alessandro Ottaviano, Matteo Perotti, Thomas Benz, Yvan Tortorella, Robert Balas, Michael Rogenmoser, Chi Zhang, Luca Bertaccini, Nils Wistoff, Maicol Ciani, Cyril Koenig, Mattia Sinigaglia, Luca Valente, Paul Scheffler, Manuel Eggimann, Matheus Cavalcante, Francesco Restuccia, Alessandro Biondi, Francesco Conti, Frank K. Gurkaynak, Davide Rossi, Luca Benini | 2025-02-26 | Download | Next-generation mixed-criticality Systems-on-chip (SoCs) for robotics, automotive, and space must execute mixed-criticality AI-enhanced sensor processing and control workloads, ensuring reliable and t... |
| M-ANT: Efficient Low-bit Group Quantization for LLMs via Mathematically Adaptive Numerical Type | Weiming Hu, Haoyan Zhang, Cong Guo, Yu Feng, Renyang Guan, Zhendong Hua, Zihan Liu, Yue Guan, Minyi Guo, Jingwen Leng | 2025-02-26 | Download | Large language models (LLMs) are one of the most important killer computer applications. The recent algorithmic advancement proposes a fine-grained group-wise quantization for LLMs, which treats a sma... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| HDEE: Heterogeneous Domain Expert Ensemble | Oğuzhan Ersoy, Jari Kolehmainen, Gabriel Passamani Andrade | 2025-02-26 | Download | Training dense LLMs requires enormous amounts of data and centralized compute, which introduces fundamental bottlenecks and ever-growing costs for large models. |
| Algorithms for Parallel Shared-Memory Sparse Matrix-Vector Multiplication on Unstructured Matrices | Kobe Bergmans, Karl Meerbergen, Raf Vandebril | 2025-02-26 | Download | The sparse matrix-vector (SpMV) multiplication is an important computational kernel, but it is notoriously difficult to execute efficiently. This paper investigates algorithm performance for unstructu... |
| Efficient Federated Search for Retrieval-Augmented Generation | Rachid Guerraoui, Anne-Marie Kermarrec, Diana Petrescu, Rafael Pires, Mathis Randl, Martijn de Vos | 2025-02-26 | Download | Large language models (LLMs) have demonstrated remarkable capabilities across various domains but remain susceptible to hallucinations and inconsistencies, limiting their reliability. |
| FedCDC: A Collaborative Framework for Data Consumers in Federated Learning Market | Zhuan Shi, Patrick Ohl, Boi Faltings | 2025-02-26 | Download | Federated learning (FL) allows machine learning models to be trained on distributed datasets without directly accessing local data. In FL markets, numerous Data Consumers compete to recruit Data Owner... |
| A Reliable, Time-Predictable Heterogeneous SoC for AI-Enhanced Mixed-Criticality Edge Applications | Angelo Garofalo, Alessandro Ottaviano, Matteo Perotti, Thomas Benz, Yvan Tortorella, Robert Balas, Michael Rogenmoser, Chi Zhang, Luca Bertaccini, Nils Wistoff, Maicol Ciani, Cyril Koenig, Mattia Sinigaglia, Luca Valente, Paul Scheffler, Manuel Eggimann, Matheus Cavalcante, Francesco Restuccia, Alessandro Biondi, Francesco Conti, Frank K. Gurkaynak, Davide Rossi, Luca Benini | 2025-02-26 | Download | Next-generation mixed-criticality Systems-on-chip (SoCs) for robotics, automotive, and space must execute mixed-criticality AI-enhanced sensor processing and control workloads, ensuring reliable and t... |
| CLLoRA: An Approach to Measure the Effects of the Context Length for LLM Fine-Tuning | Ping Zhang, Zhaorui Zhang, Sheng Di, Yao Xin, Benben Liu | 2025-02-26 | Download | Large language model fine-tuning has been identified as an efficient approach to applying the pre-trained Large language models to other domains. |
| Research on Edge Computing and Cloud Collaborative Resource Scheduling Optimization Based on Deep Reinforcement Learning | Yuqing Wang, Xiao Yang | 2025-02-26 | Download | This study addresses the challenge of resource scheduling optimization in edge-cloud collaborative computing using deep reinforcement learning (DRL). |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| On Supporting IP Routing in the Next Generation of Mobile Systems | Hamed Hellaoui, Matti Laitila, Markus Isomäki, Hua Chao | 2025-02-26 | Download | The upcoming generation of mobile telecommunication systems is expected to support new use cases, where the mobile network serves one or more IP subnetworks located behind the User Equipment (UEs). |
| Formowanie klastrów obsługujących w sieci 6g open ran o architekturze zorientowanej na użytkownika | Marcin Hoffmann | 2025-02-26 | Download | One of the main challenges associated with the implementation of an Open RAN User-Centric Cell-Free network is the appropriate formulation of serving clusters. |
| A Multi-Agent DRL-Based Framework for Optimal Resource Allocation and Twin Migration in the Multi-Tier Vehicular Metaverse | Nahom Abishu Hayla, A. Mohammed Seid, Aiman Erbad, Tilahun M. Getu, Ala Al-Fuqaha, Mohsen Guizani | 2025-02-26 | Download | Although multi-tier vehicular Metaverse promises to transform vehicles into essential nodes -- within an interconnected digital ecosystem -- using efficient resource allocation and seamless vehicular ... |
| Sequential Entanglement-Swapping assisted by Quantum Protocol over Ethernet Networks | Kun Chen-Hu, Kristian S. Jensen, Petar Popovski | 2025-02-26 | Download | The integration of quantum communication protocols over Ethernet networks is proposed, showing the potential of combining classical and quantum technologies for efficient, scalable quantum networking. |
| O-RIS-ing: Evaluating RIS-Assisted NextG Open RAN | Maria Tsampazi, Michele Polese, Falko Dressler, Tommaso Melodia | 2025-02-26 | Download | Reconfigurable Intelligent Surfaces (RISs) pose as a transformative technology to revolutionize the cellular architecture of Next Generation (NextG) Radio Access Networks (RANs). |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Safe and usable kernel extensions with Rex | Jinghao Jia, Ruowen Qin, Milo Craun, Egor Lukiyanov, Ayush Bansal, Michael V. Le, Hubertus Franke, Hani Jamjoom, Tianyin Xu, Dan Williams | 2025-02-26 | Download | Safe kernel extensions have gained significant traction, evolving from simple packet filters to large, complex programs that customize storage, networking, and scheduling. |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Online Pseudo-average Shifting Attention(PASA) for Robust Low-precision LLM Inference: Algorithms and Numerical Analysis | Long Cheng, Qichen Liao, Fan Wu, Junlin Mu, Tengfei Han, Zhe Qiu, Lianqiang Li, Tianyi Liu, Fangzheng Miao, Keming Gao, Liang Wang, Zhen Zhang, Qiande Yin | 2025-02-26 | Download | Attention calculation is extremely time-consuming for long-sequence inference tasks, such as text or image/video generation, in large models. To accelerate this process, we developed a low-precision, ... |