2025-04-10

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Empowering Vector Architectures for ML: The CAMP Architecture for Matrix Multiplication | Mohammadreza Esmali Nojehdeh, Hossein Mokhtarnia, Julian Pavon Rivera, Narcis Rodas Quiroga, Roger Figueras Bagué, Enrico Reggiani, Miquel Moreto, Osman Unsal, Adrian Cristal, Eduard Ayguade | 2025-04-10 | Download | This study presents the Cartesian Accumulative Matrix Pipeline (CAMP) architecture, a novel approach designed to enhance matrix multiplication in Vector Architectures (VAs) and Single Instruction Mult... |
| APSQ: Additive Partial Sum Quantization with Algorithm-Hardware Co-Design | Yonghao Tan, Pingcheng Dong, Yongkun Wu, Yu Liu, Xuejiao Liu, Peng Luo, Shih-Yang Liu, Xijie Huang, Dong Zhang, Luhong Liang, Kwang-Ting Cheng | 2025-04-10 | Download | DNN accelerators, significantly advanced by model compression and specialized dataflow techniques, have marked considerable progress. However, the frequent access of high-precision partial sums (PSUMs... |
| High-Level Synthesis using SDF-AP, Template Haskell, QuasiQuotes, and GADTs to Generate Circuits from Hierarchical Input Specification | Hendrik Folmer | 2025-04-10 | Download | FPGAs provide highly parallel and customizable hardware solutions but are traditionally programmed using low-level Hardware Description Languages (HDLs) like VHDL and Verilog. |
| High-Level Synthesis of Digital Circuits from Template Haskell and SDF-AP | Hendrik Folmer, Robert de Groote, Marco Bekooij | 2025-04-10 | Download | Functional languages as input specifications for High-Level Synthesis (HLS) tools allow to specify data dependencies but do not contain a notion of time nor execution order. |
| Quadrilatero: A RISC-V programmable matrix coprocessor for low-power edge applications | Danilo Cammarata, Matteo Perotti, Marco Bertuletti, Angelo Garofalo, Pasquale Davide Schiavone, David Atienza, Luca Benini | 2025-04-10 | Download | The rapid growth of AI-based Internet-of-Things applications increased the demand for high-performance edge processing engines on a low-power budget and tight area constraints. |
| Just TestIt! An SBST Approach To Automate System-Integration Testing | Tommaso Terzano, Luigi Giuffrida, Juan Sapriza, Pasquale Davide Schiavone, Guido Masera, David Atienza, Luciano Lavagno, Maurizio Martina | 2025-04-10 | Download | This paper introduces TestIt, an open-source Python package designed to automate full-system integration testing using a Software-Based Self-Test (SBST) approach. |
| A 950 MHz SIMT Soft Processor | Martin Langhammer, Gregg Baeckler, Kim Bozman | 2025-04-10 | Download | Although modern FPGAs have a performance potential of a 1 GHz clock frequency - with both clock networks and embedded blocks such as memories and DSP Blocks capable of these clock rates - user impleme... |
| UniCAIM: A Unified CAM/CIM Architecture with Static-Dynamic KV Cache Pruning for Efficient Long-Context LLM Inference | Weikai Xu, Wenxuan Zeng, Qianqian Huang, Meng Li, Ru Huang | 2025-04-10 | Download | Transformer-based large language models (LLMs) have achieved impressive performance in various natural language processing (NLP) applications. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Orchestrating Agents and Data for Enterprise: A Blueprint Architecture for Compound AI | Eser Kandogan, Nikita Bhutani, Dan Zhang, Rafael Li Chen, Sairam Gurajada, Estevam Hruschka | 2025-04-10 | Download | Large language models (LLMs) have gained significant interest in industry due to their impressive capabilities across a wide range of tasks. However, the widespread adoption of LLMs presents several c... |
| On Quorum Sizes in DAG-Based BFT Protocols | Razya Ladelsky, Roy Friedman | 2025-04-10 | Download | Several prominent DAG-based blockchain protocols, such as DAG-Rider, Tusk, and Bullshark, completely separate between equivocation elimination and committing; equivocation is handled through the use o... |
| Token Level Routing Inference System for Edge Devices | Jianshu She, Wenhao Zheng, Zhengzhong Liu, Hongyi Wang, Eric Xing, Huaxiu Yao, Qirong Ho | 2025-04-10 | Download | The computational complexity of large language model (LLM) inference significantly constrains their deployment efficiency on edge devices. In contrast, small language models offer faster decoding and ... |
| GPT Carry-On: Training Foundation Model for Customization Could Be Simple, Scalable and Affordable | Jianqiao Wangni | 2025-04-10 | Download | Modern large language foundation models (LLM) have now entered the daily lives of millions of users. We ask a natural question whether it is possible to customize LLM for every user or every task. |
| Traversal Learning: A Lossless And Efficient Distributed Learning Framework | Erdenebileg Batbaatar, Jeonggeol Kim, Yongcheol Kim, Young Yoon | 2025-04-10 | Download | In this paper, we introduce Traversal Learning (TL), a novel approach designed to address the problem of decreased quality encountered in popular distributed learning (DL) paradigms such as Federated ... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Hybrid Reinforcement Learning-based Sustainable Multi-User Computation Offloading for Mobile Edge-Quantum Computing | Minrui Xu, Dusit Niyato, Jiawen Kang, Zehui Xiong, Mingzhe Chen, Dong In Kim, Xuemin Shen | 2025-04-10 | Download | Exploiting quantum computing at the mobile edge holds immense potential for facilitating large-scale network design, processing multimodal data, optimizing resource management, and enhancing network s... |
| Wavelet-Based CSI Reconstruction for Improved Wireless Security Through Channel Reciprocity | Nora Basha, Bechir Hamdaoui | 2025-04-10 | Download | The reciprocity of channel state information (CSI) collected by two devices communicating over a wireless channel has been leveraged to provide security solutions to resource-limited IoT devices. |
| Cellular-X: An LLM-empowered Cellular Agent for Efficient Base Station Operations | Liujianfu Wang, Xinyi Long, Yuyang Du, Xiaoyan Liu, Kexin Chen, Soung Chang Liew | 2025-04-10 | Download | This paper introduces Cellular-X, an LLM-powered agent designed to automate cellular base station (BS) maintenance. Leveraging multimodal LLM and retrieval-augmented generation (RAG) techniques, Cellu... |
| NeRF-APT: A New NeRF Framework for Wireless Channel Prediction | Jingzhou Shen, Tianya Zhao, Yanzhao Wu, Xuyu Wang | 2025-04-10 | Download | Neural radiance fields (NeRFs) have recently attracted significant attention in the field of wireless channel prediction, primarily due to their capability for high-fidelity reconstruction of complex ... |
| A Hybrid Semantic RAN Protocol Stack Design for 6G System and Its Implementation | Luhan Wang, Haiwen Niu, Zhaoming Lu, Xiangming Wen | 2025-04-10 | Download | Recently, Semantic Communication (SC) has been recognized as a crucial new paradigm in 6G, significantly improving information transmission efficiency. |
| MUFFLER: Secure Tor Traffic Obfuscation with Dynamic Connection Shuffling and Splitting | Minjae Seo, Myoungsung You, Jaehan Kim, Taejune Park, Seungwon Shin, Jinwoo Kim | 2025-04-10 | Download | Tor, a widely utilized privacy network, enables anonymous communication but is vulnerable to flow correlation attacks that deanonymize users by correlating traffic patterns from Tor's ingress and egre... |
| Deep Learning Based Service Composition in Integrated Aerial-Terrestrial Networks | Mohammad Farhoudi, Masoud Shokrnezhad, Somayeh Kianpisheh, Tarik Taleb | 2025-04-10 | Download | The explosive growth of user devices and emerging applications is driving unprecedented traffic demands, accompanied by stringent Quality of Service (QoS) requirements. |
| LLM-Enabled Data Transmission in End-to-End Semantic Communication | Shavbo Salehi, Melike Erol-Kantarci, Dusit Niyato | 2025-04-10 | Download | Emerging services such as augmented reality (AR) and virtual reality (VR) have increased the volume of data transmitted in wireless communication systems, revealing the limitations of traditional Shan... |
| Task-oriented Age of Information for Remote Inference with Hybrid Language Models | Shuying Gan, Xijun Wang, Chenyuan Feng, Chao Xu, Howard H. Yang, Xiang Chen, Tony Q. S. Quek | 2025-04-10 | Download | Large Language Models (LLMs) have revolutionized the field of artificial intelligence (AI) through their advanced reasoning capabilities, but their extensive parameter sets introduce significant infer... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Improving Multiresource Job Scheduling with Markovian Service Rate Policies | Zhongrui Chen, Isaac Grosof, Benjamin Berg | 2025-04-10 | Download | Modern cloud computing workloads are composed of multiresource jobs that require a variety of computational resources in order to run, such as CPU cores, memory, disk space, or hardware accelerators. |