
2024-12-03

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Massimult: A Novel Parallel CPU Architecture Based on Combinator Reduction | Jurgen Nicklisch-Franken, Ruslan Feizerakhmanov | 2024-12-03 | Download | The Massimult project aims to design and implement an innovative CPU architecture based on combinator reduction with a novel combinator base and a new abstract machine. |
| The Tiny Median Filter: A Small Size, Flexible Arbitrary Percentile Finder Scheme Suitable for FPGA Implementation | Jinyuan Wu | 2024-12-03 | Download | This document reports the design, implementation and testing of a small silicon resource usage, very flexible arbitrary percentile finding scheme called the Tiny Median Filter. |
| PrefixLLM: LLM-aided Prefix Circuit Design | Weihua Xiao, Venkata Sai Charan Putrevu, Raghu Vamshi Hemadri, Siddharth Garg, Ramesh Karri | 2024-12-03 | Download | Prefix circuits are fundamental components in digital adders, widely used in digital systems due to their efficiency in calculating carry signals. |
| ML-based AIG Timing Prediction to Enhance Logic Optimization | Wenjing Jiang, Jin Yan, Sachin S. Sapatnekar | 2024-12-03 | Download | As circuit designs become more intricate, obtaining accurate performance estimation in early stages, for effective design space exploration, becomes more time-consuming. |
| MASIM: An Efficient Multi-Array Scheduler for In-Memory SIMD Computation | Xingyue Qian, Chen Nie, Zhezhi He, Weikang Qian | 2024-12-03 | Download | Single instruction, multiple data (SIMD) is a popular design style of in-memory computing (IMC) architectures, which enables memory arrays to perform logic operations to achieve low energy consumption... |
| Compromising the Intelligence of Modern DNNs: On the Effectiveness of Targeted RowPress | Ranyang Zhou, Jacqueline T. Liu, Sabbir Ahmed, Shaahin Angizi, Adnan Siraj Rakin | 2024-12-03 | Download | Recent advancements in side-channel attacks have revealed the vulnerability of modern Deep Neural Networks (DNNs) to malicious adversarial weight attacks. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| GoldFish: Serverless Actors with Short-Term Memory State for the Edge-Cloud Continuum | Cynthia Marcelino, Jack Shahhoud, Stefan Nastic | 2024-12-03 | Download | Serverless Computing is a computing paradigm that provides efficient infrastructure management and elastic scalability. Serverless functions scale up or down based on demand, which means that function... |
| QPET: A Versatile and Portable Quantity-of-Interest-Preservation Framework for Error-Bounded Lossy Compression | Jinyang Liu, Pu Jiao, Kai Zhao, Xin Liang, Sheng Di, Franck Cappello | 2024-12-03 | Download | Error-bounded lossy compression has been widely adopted in many scientific domains because it can address the challenges in storing, transferring, and analyzing unprecedented amounts of scientific dat... |
| Taurus Database: How to be Fast, Available, and Frugal in the Cloud | Alex Depoutovitch, Chong Chen, Jin Chen, Paul Larson, Shu Lin, Jack Ng, Wenlin Cui, Qiang Liu, Wei Huang, Yong Xiao, Yongjun He | 2024-12-03 | Download | Using cloud Database as a Service (DBaaS) offerings instead of on-premise deployments is increasingly common. Key advantages include improved availability and scalability at a lower cost than on-premi... |
| Massimult: A Novel Parallel CPU Architecture Based on Combinator Reduction | Jurgen Nicklisch-Franken, Ruslan Feizerakhmanov | 2024-12-03 | Download | The Massimult project aims to design and implement an innovative CPU architecture based on combinator reduction with a novel combinator base and a new abstract machine. |
| Performance Debugging through Microarchitectural Sensitivity and Causality Analysis | Alban Dutilleul, Hugo Pompougnac, Nicolas Derumigny, Gabriel Rodriguez, Valentin Trophime, Christophe Guillon, Fabrice Rastello | 2024-12-03 | Download | Modern Out-of-Order (OoO) CPUs are complex systems with many components interleaved in non-trivial ways. Pinpointing performance bottlenecks and understanding the underlying causes of program performa... |
| Resource-Adaptive Successive Doubling for Hyperparameter Optimization with Large Datasets on High-Performance Computing Systems | Marcel Aach, Rakesh Sarma, Helmut Neukirchen, Morris Riedel, Andreas Lintermann | 2024-12-03 | Download | On High-Performance Computing (HPC) systems, several hyperparameter configurations can be evaluated in parallel to speed up the Hyperparameter Optimization (HPO) process. |
| Scalable Analysis of Urban Scaling Laws: Leveraging Cloud Computing to Analyze 21,280 Global Cities | Zhenhui Li, Hongwei Zhang, Kan Wu | 2024-12-03 | Download | Cities play a pivotal role in human development and sustainability, yet studying them presents significant challenges due to the vast scale and complexity of spatial-temporal data. |
| Learn More by Using Less: Distributed Learning with Energy-Constrained Devices | Roberto Pereira, Cristian J. Vaca-Rubio, Luis Blanco | 2024-12-03 | Download | Federated Learning (FL) has emerged as a solution for distributed model training across decentralized, privacy-preserving devices, but the different energy capacities of participating devices (system ... |
| Technical Report on Reinforcement Learning Control on the Lucas-Nülle Inverted Pendulum | Maximilian Schenke, Shalbus Bukarov | 2024-12-03 | Download | The discipline of automatic control is making increased use of concepts that originate from the domain of machine learning. Herein, reinforcement learning (RL) takes an elevated role, as it is inheren... |
| Connecting Large Language Models with Blockchain: Advancing the Evolution of Smart Contracts from Automation to Intelligence | Youquan Xian, Xueying Zeng, Duancheng Xuan, Danping Yang, Chunpei Li, Peng Fan, Peng Liu | 2024-12-03 | Download | Blockchain smart contracts have catalyzed the development of decentralized applications across various domains, including decentralized finance. |
| Matryoshka: Optimization of Dynamic Diverse Quantum Chemistry Systems via Elastic Parallelism Transformation | Tuowei Wang, Kun Li, Donglin Bai, Fusong Ju, Leo Xia, Ting Cao, Ju Ren, Yaoxue Zhang, Mao Yang | 2024-12-03 | Download | AI infrastructures, predominantly GPUs, have delivered remarkable performance gains for deep learning. Conversely, scientific computing, exemplified by quantum chemistry systems, suffers from dynamic ... |
| Thallus: An RDMA-based Columnar Data Transport Protocol | Jayjeet Chakraborty, Matthieu Dorier, Philip Carns, Robert Ross, Carlos Maltzahn, Heiner Litz | 2024-12-03 | Download | The volume of data generated and stored in contemporary global data centers is experiencing exponential growth. This rapid data growth necessitates efficient processing and analysis to extract valuabl... |
| Towards the efficacy of federated prediction for epidemics on networks | Chengpeng Fu, Tong Li, Hao Chen, Wen Du, Zhidong He | 2024-12-03 | Download | Epidemic prediction is of practical significance in public health, enabling early intervention, resource allocation, and strategic planning. However, privacy concerns often hinder the sharing of healt... |
| Multi-Bin Batching for Increasing LLM Inference Throughput | Ozgur Guldogan, Jackson Kunde, Kangwook Lee, Ramtin Pedarsani | 2024-12-03 | Download | As large language models (LLMs) grow in popularity for their diverse capabilities, improving the efficiency of their inference systems has become increasingly critical. |
| Simplifying HPC resource selection: A tool for optimizing execution time and cost on Azure | Marco A. S. Netto, Wolfgang De Savador, Davide Vanzo | 2024-12-03 | Download | Azure Cloud offers a wide range of resources for running HPC workloads, requiring users to configure their deployment by selecting VM types, number of VMs, and processes per VM. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| The GAIUS Experience: Powering a Hyperlocal Mobile Web for Communities in Emerging Regions | Rohail Asim, Arjuna Sathiaseelan, Arko Chatterjee, Mukund Lal, Yasir Zaki, Lakshmi Subramanian | 2024-12-03 | Download | Despite increasing mobile Internet penetration in developing regions, mobile users continue to experience a poor web experience due to two key factors: (i) lack of locally relevant content; (ii) poor ... |
| Revolutionizing QoE-Driven Network Management with Digital Agents in 6G | Xuemin Shen, Xinyu Huang, Jianzhe Xue, Conghao Zhou, Xiufang Shi, Weihua Zhuang | 2024-12-03 | Download | In this article, we present a digital agent (DA)-assisted network management framework for future sixth generation (6G) networks considering user quality of experience (QoE). |
| Wall-Proximity Matters: Understanding the Effect of Device Placement with Respect to the Wall for Indoor Wi-Fi Sensing | He Wang, Yunpeng Ge, Ivan Wang-Hei Ho | 2024-12-03 | Download | Wi-Fi sensing has been extensively explored for various applications, including vital sign monitoring, human activity recognition, indoor localization, and tracking. |
| Hamiltonian Monte Carlo-Based Near-Optimal MIMO Signal Detection | Junichiro Hagiwara, Toshihiko Nishimura, Takanori Sato, Yasutaka Ogawa, Takeo Ohgane | 2024-12-03 | Download | Multiple-input multiple-output (MIMO) technology is essential for the optimal functioning of next-generation wireless networks; however, enhancing its signal-detection performance for improved spectra... |
| Exploring Evolutionary Spectral Clustering for Temporal-Smoothed Clustered Cell-Free Networking | Junyuan Wang, Tianyao Wu, Ouyang Zhou, Yaping Zhu | 2024-12-03 | Download | Clustered cell-free networking, which dynamically partitions the whole network into nonoverlapping subnetworks, has been recently proposed to mitigate the cell-edge problem in cellular networks. |
| Optimizing Age of Information in Internet of Vehicles Over Error-Prone Channels | Cui Zhang, Maoxin Ji, Qiong Wu, Pingyi Fan, Qiang Fan | 2024-12-03 | Download | In the Internet of Vehicles (IoV), Age of Information (AoI) has become a vital performance metric for evaluating the freshness of information in communication systems. |

cs.OS - Operating Systems

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Thallus: An RDMA-based Columnar Data Transport Protocol | Jayjeet Chakraborty, Matthieu Dorier, Philip Carns, Robert Ross, Carlos Maltzahn, Heiner Litz | 2024-12-03 | Download | The volume of data generated and stored in contemporary global data centers is experiencing exponential growth. This rapid data growth necessitates efficient processing and analysis to extract valuabl... |
| Retrofitting XoM for Stripped Binaries without Embedded Data Relocation | Chenke Luo, Jiang Ming, Mengfei Xie, Guojun Peng, Jianming Fu | 2024-12-03 | Download | In this paper, we present PXoM, a practical technique to seamlessly retrofit XoM into stripped binaries on the x86-64 platform. As handling the mixture of code and data is a well-known challenge for X... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Massimult: A Novel Parallel CPU Architecture Based on Combinator Reduction | Jurgen Nicklisch-Franken, Ruslan Feizerakhmanov | 2024-12-03 | Download | The Massimult project aims to design and implement an innovative CPU architecture based on combinator reduction with a novel combinator base and a new abstract machine. |
| AI-Driven Resource Allocation Framework for Microservices in Hybrid Cloud Platforms | Biman Barua, M. Shamim Kaiser | 2024-12-03 | Download | The increasing demand for scalable, efficient resource management in hybrid cloud environments has led to the exploration of AI-driven approaches for dynamic resource allocation. |
| Performance Debugging through Microarchitectural Sensitivity and Causality Analysis | Alban Dutilleul, Hugo Pompougnac, Nicolas Derumigny, Gabriel Rodriguez, Valentin Trophime, Christophe Guillon, Fabrice Rastello | 2024-12-03 | Download | Modern Out-of-Order (OoO) CPUs are complex systems with many components interleaved in non-trivial ways. Pinpointing performance bottlenecks and understanding the underlying causes of program performa... |
| Matryoshka: Optimization of Dynamic Diverse Quantum Chemistry Systems via Elastic Parallelism Transformation | Tuowei Wang, Kun Li, Donglin Bai, Fusong Ju, Leo Xia, Ting Cao, Ju Ren, Yaoxue Zhang, Mao Yang | 2024-12-03 | Download | AI infrastructures, predominantly GPUs, have delivered remarkable performance gains for deep learning. Conversely, scientific computing, exemplified by quantum chemistry systems, suffers from dynamic ... |
