2024-11-27

cs.AR - Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Mixture of Cache-Conditional Experts for Efficient Mobile Device Inference | Andrii Skliar, Ties van Rozendaal, Romain Lepert, Todor Boinovski, Mart van Baalen, Markus Nagel, Paul Whatmough, Babak Ehteshami Bejnordi | 2024-11-27 | Download | Mixture of Experts (MoE) LLMs have recently gained attention for their ability to enhance performance by selectively engaging specialized subnetworks or "experts" for each input. |
| Optimising Iteration Scheduling for Full-State Vector Simulation of Quantum Circuits on FPGAs | Youssef Moawad, Andrew Brown, René Steijl, Wim Vanderbauwhede | 2024-11-27 | Download | As the field of quantum computing grows, novel algorithms which take advantage of quantum phenomena need to be developed. As we are currently in the NISQ (noisy intermediate scale quantum) era, quantu... |
| QUADOL: A Quality-Driven Approximate Logic Synthesis Method Exploiting Dual-Output LUTs for Modern FPGAs | Jian Shi, Xuan Wang, Chang Meng, Weikang Qian | 2024-11-27 | Download | Approximate computing is a new computing paradigm. One important area of it is designing approximate circuits for FPGA. Modern FPGAs support dual-output LUT, which can significantly reduce the area of... |
| CXL-Interference: Analysis and Characterization in Modern Computer Systems | Shunyu Mao, Jiajun Luo, Yixin Li, Jiapeng Zhou, Weidong Zhang, Zheng Liu, Teng Ma, Shuwen Deng | 2024-11-27 | Download | Compute Express Link (CXL) is a promising technology that addresses memory and storage challenges. Despite its advantages, CXL faces performance threats from external interference when co-existing wit... |
| Efficient Nonlinear Function Approximation in Analog Resistive Crossbars for Recurrent Neural Networks | Junyi Yang, Ruibin Mao, Mingrui Jiang, Yichuan Cheng, Pao-Sheng Vincent Sun, Shuai Dong, Giacomo Pedretti, Xia Sheng, Jim Ignowski, Haoliang Li, Can Li, Arindam Basu | 2024-11-27 | Download | Analog In-memory Computing (IMC) has demonstrated energy-efficient and low latency implementation of convolution and fully-connected layers in deep neural networks (DNN) by using physics for computing... |
| A Runtime-Adaptive Transformer Neural Network Accelerator on FPGAs | Ehsan Kabir, Jason D. Bakos, David Andrews, Miaoqing Huang | 2024-11-27 | Download | Transformer neural networks (TNN) excel in natural language processing (NLP), machine translation, and computer vision (CV) without relying on recurrent or convolutional layers. |
| A 65-nm Reliable 6T CMOS SRAM Cell with Minimum Size Transistors | Gabriel Torrens, Bartomeu Alorda, Cristian Carmona, Daniel Malagon-Perianez, Jaume Segura, Sebastia Antoni Bota | 2024-11-27 | Download | As minimum area SRAM bit-cells are obtained when using cell ratio and pull-up ratio of 1, we analyze the possibility of decreasing the cell ratio from the conventional values between 1.5 and 2. |
| High-Level Surface Code Decoding via Parallel FFNNs on CIM Platforms | Hao Wang, Erjia Xiao, Wenbo Mu, Songhuan He, Zhongyi Ni, Lingfeng Zhang, Xiaokun Zhan, Yifei Cui, Jinguo Liu, Cheng Wang, Zhongrui Wang, Renjing Xu | 2024-11-27 | Download | Due to the high sensitivity of qubits to environmental noise, which leads to decoherence and information loss, active quantum error correction (QEC) is essential. |
| FlexiBit: Fully Flexible Precision Bit-parallel Accelerator Architecture for Arbitrary Mixed Precision AI | Faraz Tahmasebi, Yian Wang, Benji Y. H. Huang, Hyoukjun Kwon | 2024-11-27 | Download | Recent research has shown that large language models (LLMs) can utilize low-precision floating point (FP) quantization to deliver high efficiency while maintaining original model accuracy. |
| Reconfigurable Stream Network Architecture | Chengyue Wang, Xiaofan Zhang, Jason Cong, James C. Hoe | 2024-11-27 | Download | As AI systems grow increasingly specialized and complex, managing hardware heterogeneity becomes a pressing challenge. How can we efficiently coordinate and synchronize heterogeneous hardware resource... |
| Calibrating DRAMPower Model for HPC: A Runtime Perspective from Real-Time Measurements | Xinyu Shi, Dina Ali Abdelhamid, Thomas Ilsche, Saeideh Alinezhad Chamazcoti, Timon Evenblij, Mohit Gupta, Francky Catthoor | 2024-11-27 | Download | Main memory's rising energy consumption has emerged as a critical challenge in modern computing architectures, particularly in large-scale systems, driven by frequent access patterns, growing data vol... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Formal Verification of Digital Twins with TLA and Information Leakage Control | Luwen Huang, Lav R. Varshney, Karen E. Willcox | 2024-11-27 | Download | Verifying the correctness of a digital twin provides a formal guarantee that the digital twin operates as intended. Digital twin verification is challenging due to the presence of uncertainties in the... |
| Locally Differentially Private Online Federated Learning With Correlated Noise | Jiaojiao Zhang, Linglingzhi Zhu, Dominik Fay, Mikael Johansson | 2024-11-27 | Download | We introduce a locally differentially private (LDP) algorithm for online federated learning that employs temporally correlated noise to improve utility while preserving privacy. |
| CkIO: Parallel File Input for Over-Decomposed Task-Based Systems | Mathew Jacob, Maya Taylor, Laxmikant Kale | 2024-11-27 | Download | Parallel input performance issues are often neglected in large scale parallel applications in Computational Science and Engineering. Traditionally, there has been less focus on input performance becau... |
| FastSwitch: Optimizing Context Switching Efficiency in Fairness-aware Large Language Model Serving | Ao Shen, Zhiyao Li, Mingyu Gao | 2024-11-27 | Download | Serving numerous users and requests concurrently requires good fairness in Large Language Model (LLM) serving systems. This ensures that, at the same cost, the system can meet the Service Level Objec... |
| Using Malware Detection Techniques for HPC Application Classification | Thomas Jakobsche, Florina M. Ciorba | 2024-11-27 | Download | HPC systems face security and compliance challenges, particularly in preventing waste and misuse of computational resources by unauthorized or malicious software that deviates from allocation purpose. |
| Energy-Efficient Split Learning for Fine-Tuning Large Language Models in Edge Networks | Zuguang Li, Shaohua Wu, Liang Li, Songge Zhang | 2024-11-27 | Download | In this letter, we propose an energy-efficient split learning (SL) framework for fine-tuning large language models (LLMs) using geo-distributed personal data at the network edge, where LLMs are split ... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Integrated Heterogeneous Service Provisioning: Unifying Beyond-Communication Capabilities with MDMA in 6G and Future Wireless Networks | Pengyi Jia, Xianbin Wang, Yongxu Zhu, Shi Jin, Robert Schober | 2024-11-27 | Download | The rapid evolution and convergence of wireless technologies and vertical applications have fundamentally reshaped our lifestyles and industries. |
| Optimal In-Network Distribution of Learning Functions for a Secure-by-Design Programmable Data Plane of Next-Generation Networks | Mattia Giovanni Spina, Edoardo Scalzo, Floriano De Rango, Francesca Guerriero, Antonio Iera | 2024-11-27 | Download | The rise of programmable data plane (PDP) and in-network computing (INC) paradigms paves the way for the development of network devices (switches, network interface cards, etc. |
| OpenOptics: An Open Research Framework for Optical Data Center Networks | Yiming Lei, Federico De Marchi, Jialong Li, Raj Joshi, Balakrishnan Chandrasekaran, Yiting Xia | 2024-11-27 | Download | Optical data center networks (DCNs) are emerging as a promising design for cloud infrastructure. However, existing optical DCN architectures operate as closed ecosystems, tying software solutions to s... |
| Semantic Edge Computing and Semantic Communications in 6G Networks: A Unifying Survey and Research Challenges | Milin Zhang, Mohammad Abdi, Venkat R. Dasari, Francesco Restuccia | 2024-11-27 | Download | Semantic Edge Computing (SEC) and Semantic Communications (SemComs) have been proposed as viable approaches to achieve real-time edge-enabled intelligence in sixth-generation (6G) wireless networks. |
| Edge-Assisted Accelerated Cooperative Sensing for CAVs: Task Placement and Resource Allocation | Yuxuan Wang, Kaige Qu, Wen Wu, Xuemin Shen | 2024-11-27 | Download | In this paper, we propose a novel road side unit (RSU)-assisted cooperative sensing scheme for connected autonomous vehicles (CAVs), with the objective to reduce completion time of sensing tasks. |
| Online Adaptive Real-Time Beamforming Design for Dynamic Environments in Cell-Free Systems | Guanghui Chen, Zheng Wang, Hongxin Lin, Pengguang Du, Yongming Huang | 2024-11-27 | Download | In this paper, we consider real-time beamforming design for dynamic wireless environments with varying channels and different numbers of access points (APs) and users in cell-free systems. |
| P4-NIDS: High-Performance Network Monitoring and Intrusion Detection in P4 | Yaying Chen, Siamak Layeghy, Liam Daly Manocchio, Marius Portmann | 2024-11-27 | Download | This paper presents a high-performance, scalable network monitoring and intrusion detection system (IDS) implemented in P4. The proposed solution is designed for high-performance environments such as ... |
| Durbin: Internet Outage Detection with Adaptive Passive Analysis | Asma Enayet, John Heidemann | 2024-11-27 | Download | Measuring Internet outages is important to allow ISPs to improve their services, users to choose providers by reliability, and governments to understand the reliability of their infrastructure. |

cs.PF - Performance

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| CXL-Interference: Analysis and Characterization in Modern Computer Systems | Shunyu Mao, Jiajun Luo, Yixin Li, Jiapeng Zhou, Weidong Zhang, Zheng Liu, Teng Ma, Shuwen Deng | 2024-11-27 | Download | Compute Express Link (CXL) is a promising technology that addresses memory and storage challenges. Despite its advantages, CXL faces performance threats from external interference when co-existing wit... |
| Randomized-Grid Search for Hyperparameter Tuning in Decision Tree Model to Improve Performance of Cardiovascular Disease Classification | Abhay Kumar Pathak, Mrityunjay Chaubey, Manjari Gupta | 2024-11-27 | Download | Cardiovascular disease refers to any critical condition that impacts the heart. Because heart diseases can be life-threatening, researchers are focusing on designing smart systems to accurately diagno... |
