2025-04-10

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Empowering Vector Architectures for ML: The CAMP Architecture for Matrix Multiplication	Mohammadreza Esmali Nojehdeh, Hossein Mokhtarnia, Julian Pavon Rivera, Narcis Rodas Quiroga, Roger Figueras Bagué, Enrico Reggiani, Miquel Moreto, Osman Unsal, Adrian Cristal, Eduard Ayguade	2025-04-10	Download	This study presents the Cartesian Accumulative Matrix Pipeline (CAMP) architecture, a novel approach designed to enhance matrix multiplication in Vector Architectures (VAs) and Single Instruction Mult...
APSQ: Additive Partial Sum Quantization with Algorithm-Hardware Co-Design	Yonghao Tan, Pingcheng Dong, Yongkun Wu, Yu Liu, Xuejiao Liu, Peng Luo, Shih-Yang Liu, Xijie Huang, Dong Zhang, Luhong Liang, Kwang-Ting Cheng	2025-04-10	Download	DNN accelerators, significantly advanced by model compression and specialized dataflow techniques, have marked considerable progress. However, the frequent access of high-precision partial sums (PSUMs...
High-Level Synthesis using SDF-AP, Template Haskell, QuasiQuotes, and GADTs to Generate Circuits from Hierarchical Input Specification	Hendrik Folmer	2025-04-10	Download	FPGAs provide highly parallel and customizable hardware solutions but are traditionally programmed using low-level Hardware Description Languages (HDLs) like VHDL and Verilog.
High-Level Synthesis of Digital Circuits from Template Haskell and SDF-AP	Hendrik Folmer, Robert de Groote, Marco Bekooij	2025-04-10	Download	Functional languages as input specifications for High-Level Synthesis (HLS) tools allow to specify data dependencies but do not contain a notion of time nor execution order.
Quadrilatero: A RISC-V programmable matrix coprocessor for low-power edge applications	Danilo Cammarata, Matteo Perotti, Marco Bertuletti, Angelo Garofalo, Pasquale Davide Schiavone, David Atienza, Luca Benini	2025-04-10	Download	The rapid growth of AI-based Internet-of-Things applications increased the demand for high-performance edge processing engines on a low-power budget and tight area constraints.
Just TestIt! An SBST Approach To Automate System-Integration Testing	Tommaso Terzano, Luigi Giuffrida, Juan Sapriza, Pasquale Davide Schiavone, Guido Masera, David Atienza, Luciano Lavagno, Maurizio Martina	2025-04-10	Download	This paper introduces TestIt, an open-source Python package designed to automate full-system integration testing using a Software-Based Self-Test (SBST) approach.
A 950 MHz SIMT Soft Processor	Martin Langhammer, Gregg Baeckler, Kim Bozman	2025-04-10	Download	Although modern FPGAs have a performance potential of a 1 GHz clock frequency - with both clock networks and embedded blocks such as memories and DSP Blocks capable of these clock rates - user impleme...
UniCAIM: A Unified CAM/CIM Architecture with Static-Dynamic KV Cache Pruning for Efficient Long-Context LLM Inference	Weikai Xu, Wenxuan Zeng, Qianqian Huang, Meng Li, Ru Huang	2025-04-10	Download	Transformer-based large language models (LLMs) have achieved impressive performance in various natural language processing (NLP) applications.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Orchestrating Agents and Data for Enterprise: A Blueprint Architecture for Compound AI	Eser Kandogan, Nikita Bhutani, Dan Zhang, Rafael Li Chen, Sairam Gurajada, Estevam Hruschka	2025-04-10	Download	Large language models (LLMs) have gained significant interest in industry due to their impressive capabilities across a wide range of tasks. However, the widespread adoption of LLMs presents several c...
On Quorum Sizes in DAG-Based BFT Protocols	Razya Ladelsky, Roy Friedman	2025-04-10	Download	Several prominent DAG-based blockchain protocols, such as DAG-Rider, Tusk, and Bullshark, completely separate between equivocation elimination and committing; equivocation is handled through the use o...
Token Level Routing Inference System for Edge Devices	Jianshu She, Wenhao Zheng, Zhengzhong Liu, Hongyi Wang, Eric Xing, Huaxiu Yao, Qirong Ho	2025-04-10	Download	The computational complexity of large language model (LLM) inference significantly constrains their deployment efficiency on edge devices. In contrast, small language models offer faster decoding and ...
GPT Carry-On: Training Foundation Model for Customization Could Be Simple, Scalable and Affordable	Jianqiao Wangni	2025-04-10	Download	Modern large language foundation models (LLM) have now entered the daily lives of millions of users. We ask a natural question whether it is possible to customize LLM for every user or every task.
Traversal Learning: A Lossless And Efficient Distributed Learning Framework	Erdenebileg Batbaatar, Jeonggeol Kim, Yongcheol Kim, Young Yoon	2025-04-10	Download	In this paper, we introduce Traversal Learning (TL), a novel approach designed to address the problem of decreased quality encountered in popular distributed learning (DL) paradigms such as Federated ...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Hybrid Reinforcement Learning-based Sustainable Multi-User Computation Offloading for Mobile Edge-Quantum Computing	Minrui Xu, Dusit Niyato, Jiawen Kang, Zehui Xiong, Mingzhe Chen, Dong In Kim, Xuemin Shen	2025-04-10	Download	Exploiting quantum computing at the mobile edge holds immense potential for facilitating large-scale network design, processing multimodal data, optimizing resource management, and enhancing network s...
Wavelet-Based CSI Reconstruction for Improved Wireless Security Through Channel Reciprocity	Nora Basha, Bechir Hamdaoui	2025-04-10	Download	The reciprocity of channel state information (CSI) collected by two devices communicating over a wireless channel has been leveraged to provide security solutions to resource-limited IoT devices.
Cellular-X: An LLM-empowered Cellular Agent for Efficient Base Station Operations	Liujianfu Wang, Xinyi Long, Yuyang Du, Xiaoyan Liu, Kexin Chen, Soung Chang Liew	2025-04-10	Download	This paper introduces Cellular-X, an LLM-powered agent designed to automate cellular base station (BS) maintenance. Leveraging multimodal LLM and retrieval-augmented generation (RAG) techniques, Cellu...
NeRF-APT: A New NeRF Framework for Wireless Channel Prediction	Jingzhou Shen, Tianya Zhao, Yanzhao Wu, Xuyu Wang	2025-04-10	Download	Neural radiance fields (NeRFs) have recently attracted significant attention in the field of wireless channel prediction, primarily due to their capability for high-fidelity reconstruction of complex ...
A Hybrid Semantic RAN Protocol Stack Design for 6G System and Its Implementation	Luhan Wang, Haiwen Niu, Zhaoming Lu, Xiangming Wen	2025-04-10	Download	Recently, Semantic Communication (SC) has been recognized as a crucial new paradigm in 6G, significantly improving information transmission efficiency.
MUFFLER: Secure Tor Traffic Obfuscation with Dynamic Connection Shuffling and Splitting	Minjae Seo, Myoungsung You, Jaehan Kim, Taejune Park, Seungwon Shin, Jinwoo Kim	2025-04-10	Download	Tor, a widely utilized privacy network, enables anonymous communication but is vulnerable to flow correlation attacks that deanonymize users by correlating traffic patterns from Tor's ingress and egre...
Deep Learning Based Service Composition in Integrated Aerial-Terrestrial Networks	Mohammad Farhoudi, Masoud Shokrnezhad, Somayeh Kianpisheh, Tarik Taleb	2025-04-10	Download	The explosive growth of user devices and emerging applications is driving unprecedented traffic demands, accompanied by stringent Quality of Service (QoS) requirements.
LLM-Enabled Data Transmission in End-to-End Semantic Communication	Shavbo Salehi, Melike Erol-Kantarci, Dusit Niyato	2025-04-10	Download	Emerging services such as augmented reality (AR) and virtual reality (VR) have increased the volume of data transmitted in wireless communication systems, revealing the limitations of traditional Shan...
Task-oriented Age of Information for Remote Inference with Hybrid Language Models	Shuying Gan, Xijun Wang, Chenyuan Feng, Chao Xu, Howard H. Yang, Xiang Chen, Tony Q. S. Quek	2025-04-10	Download	Large Language Models (LLMs) have revolutionized the field of artificial intelligence (AI) through their advanced reasoning capabilities, but their extensive parameter sets introduce significant infer...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Improving Multiresource Job Scheduling with Markovian Service Rate Policies	Zhongrui Chen, Isaac Grosof, Benjamin Berg	2025-04-10	Download	Modern cloud computing workloads are composed of multiresource jobs that require a variety of computational resources in order to run, such as CPU cores, memory, disk space, or hardware accelerators.