2025-11-15

cs.AR - Architecture

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| Pushing the Memory Bandwidth Wall with CXL-enabled Idle I/O Bandwidth Harvesting | Divya Kiran Kadiyala, Alexandros Daglis | 2025-11-15 | The continual increase of cores on server-grade CPUs raises demands on memory systems, which are constrained by limited off-chip pin and data transfer rate scalability. |
| Sangam: Chiplet-Based DRAM-PIM Accelerator with CXL Integration for LLM Inferencing | Khyati Kiyawat, Zhenxing Fan, Yasas Seneviratne, Morteza Baradaran, Akhil Shekar, Zihan Xia, Mingu Kang, Kevin Skadron | 2025-11-15 | Large Language Models (LLMs) are becoming increasingly data-intensive due to growing model sizes, and they are becoming memory-bound as the context length and, consequently, the key-value (KV) cache s... |
| eFPE: Design, Implementation, and Evaluation of a Lightweight Format-Preserving Encryption Algorithm for Embedded Systems | Nishant Vasantkumar Hegde, Suneesh Bare, K B Ramesh, Aamir Ibrahim | 2025-11-15 | Resource-constrained embedded systems demand secure yet lightweight data protection, particularly when data formats must be preserved. This paper introduces eFPE (Enhanced Format-Preserving Encryption... |
| A Digital SRAM-Based Compute-In-Memory Macro for Weight-Stationary Dynamic Matrix Multiplication in Transformer Attention Score Computation | Jianyi Yu, Tengxiao Wang, Yuxuan Wang, Xiang Fu, Fei Qiao, Ying Wang, Rui Yuan, Liyuan Liu, Cong Shi | 2025-11-15 | Compute-in-memory (CIM) techniques are widely employed in energy-efficient artificial intelligence (AI) processors. They alleviate power and latency bottlenecks caused by extensive data movements betwe... |
| TIMERIPPLE: Accelerating vDiTs by Understanding the Spatio-Temporal Correlations in Latent Space | Wenxuan Miao, Yulin Sun, Aiyue Chen, Jing Lin, Yiwu Yao, Yiming Gan, Jieru Zhao, Jingwen Leng, Mingyi Guo, Yu Feng | 2025-11-15 | The recent surge in video generation has shown the growing demand for high-quality video synthesis using large vision models. Existing video generation models are predominantly based on the video diff... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| A novel strategy for multi-resource load balancing in agent-based systems | Leszek Sliwko, Aleksander Zgrzywa | 2025-11-15 | The paper presents a multi-resource load balancing strategy which can be utilised within an agent-based system. This approach can assist system designers in their attempts to optimise the structure fo... |
| Distributed Seasonal Temporal Pattern Mining | Van Ho-Long, Nguyen Ho, Anh-Vu Dinh-Duc, Ha Manh Tran, Ky Trung Nguyen, Tran Dung Pham, Quoc Viet Hung Nguyen | 2025-11-15 | The explosive growth of IoT-enabled sensors is producing enormous amounts of time series data across many domains, offering valuable opportunities to extract insights through temporal pattern mining. |
| Combining Serverless and High-Performance Computing Paradigms to support ML Data-Intensive Applications | Mills Staylor, Arup Kumar Sarker, Gregor von Laszewski, Geoffrey Fox, Yue Cheng, Judy Fox | 2025-11-15 | Data is found everywhere, from health and human infrastructure to the surge of sensors and the proliferation of internet-connected devices. To meet this challenge, the data engineering field has expan... |
| PipeDiT: Accelerating Diffusion Transformers in Video Generation with Task Pipelining and Model Decoupling | Sijie Wang, Qiang Wang, Shaohuai Shi | 2025-11-15 | Video generation has been advancing rapidly, and diffusion transformer (DiT) based models have demonstrated remarkable capabilities. However, their practical deployment is often hindered by slow i... |
| Striking the Right Balance between Compute and Copy: Improving LLM Inferencing Under Speculative Decoding | Arun Ramachandran, Ramaswamy Govindarajan, Murali Annavaram, Prakash Raghavendra, Hossein Entezari Zarch, Lei Gao, Chaoyi Jiang | 2025-11-15 | With the skyrocketing costs of GPUs and their virtual instances in the cloud, there is a significant desire to use CPUs for large language model (LLM) inference. |
| A Quick and Exact Method for Distributed Quantile Computation | Ivan Cao, Jaromir J. Saloni, David A. G. Harrison | 2025-11-15 | Quantile computation is a core primitive in large-scale data analytics. In Spark, practitioners typically rely on the Greenwald-Khanna (GK) Sketch, an approximate method. |
| High-Performance N-Queens Solver on GPU: Iterative DFS with Zero Bank Conflicts | Guangchao Yao, Yali Li | 2025-11-15 | The counting of solutions to the N-Queens problem is a classic NP-complete problem with extremely high computational complexity. As of now, the academic community has rigorously verified the number of... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| Joint Optimization of RU Allocation and C-SR in Multi-AP Coordinated Wi-Fi Systems | Md Rahat Hasan, Kazi Ahmed Akbar Munim, Md. Forkan Uddin | 2025-11-15 | We formulate an optimization problem for joint RU allocation and C-SR to maximize the throughput of a multi-AP coordinated WiFi system. The optimization problem is found to be a non-linear integer pro... |
| SenseRay-3D: Generalizable and Physics-Informed Framework for End-to-End Indoor Propagation Modeling | Yu Zheng, Kezhi Wang, Wenji Xi, Gang Yu, Jiming Chen, Jie Zhang | 2025-11-15 | Modeling indoor radio propagation is crucial for wireless network planning and optimization. However, existing approaches often rely on labor-intensive manual modeling of geometry and material propert... |
| A Bio-Inspired Leader-based Energy Management System for Drone Fleets | Rosario Napoli, Antonio Celesti, Massimo Villari, Maria Fazio | 2025-11-15 | Drones are embedded systems (ES) used across a wide range of fields, from photography to shipments and even during crisis management for searching, rescuing and damage assessment activities. |
| LithoSeg: A Coarse-to-Fine Framework for High-Precision Lithography Segmentation | Xinyu He, Botong Zhao, Bingbing Li, Shujing Lyu, Jiwei Shen, Yue Lu | 2025-11-15 | Accurate segmentation and measurement of lithography scanning electron microscope (SEM) images are crucial for ensuring precise process control, optimizing device performance, and advancing semiconduc... |