2025-03-20

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Revisiting DRAM Read Disturbance: Identifying Inconsistencies Between Experimental Characterization and Device-Level Studies	Haocong Luo, İsmail Emir Yüksel, Ataberk Olgun, A. Giray Yağlıkçı, Onur Mutlu	2025-03-20	Download	Modern DRAM is vulnerable to read disturbance (e.g., RowHammer and RowPress) that significantly undermines the robust operation of the system.
Design and Implementation of an FPGA-Based Hardware Accelerator for Transformer	Richie Li, Sicheng Chen	2025-03-20	Download	Transformer-based large language models (LLMs) rely heavily on intensive matrix multiplications for attention and feed-forward layers, with the Q, K, and V linear projections in the Multi-Head Self-At...
GauRast: Enhancing GPU Triangle Rasterizers to Accelerate 3D Gaussian Splatting	Sixu Li, Ben Keller, Yingyan Celine Lin, Brucek Khailany	2025-03-20	Download	3D intelligence leverages rich 3D features and stands as a promising frontier in AI, with 3D rendering fundamental to many downstream applications.
A Scalable and Robust Compilation Framework for Emitter-Photonic Graph State	Xiangyu Ren, Yuexun Huang, Zhiding Liang, Antonio Barbalace	2025-03-20	Download	Quantum graph states are critical resources for various quantum algorithms, and also determine essential interconnections in distributed quantum computing.
Explainable AI-Guided Efficient Approximate DNN Generation for Multi-Pod Systolic Arrays	Ayesha Siddique, Khurram Khalil, Khaza Anuarul Hoque	2025-03-20	Download	Approximate deep neural networks (AxDNNs) are promising for enhancing energy efficiency in real-world devices. One of the key contributors behind this enhanced energy efficiency in AxDNNs is the use o...
DSLUT: An Asymmetric LUT and its Automatic Design Flow Based on Practical Functions	Moucheng Yang, Kaixiang Zhu, Lingli Wang, Xuegong Zhou	2025-03-20	Download	The conventional LUT is redundant since practical functions in real-world benchmarks only occupy a small proportion of all the functions. For example, there are only 3881 out of more than $10^{14}$ NP...
ALLMod: Exploring $\underline{\mathbf{A}}$rea-Efficiency of $\underline{\mathbf{L}}$UT-based $\underline{\mathbf{L}}$arge Number $\underline{\mathbf{Mod}}$ular Reduction via Hybrid Workloads	Fangxin Liu, Haomin Li, Zongwu Wang, Bo Zhang, Mingzhe Zhang, Shoumeng Yan, Li Jiang, Haibing Guan	2025-03-20	Download	Modular arithmetic, particularly modular reduction, is widely used in cryptographic applications such as homomorphic encryption (HE) and zero-knowledge proofs (ZKP).
Physically Grounded Monocular Depth via Nanophotonic Wavefront Prompting	Bingxuan Li, Jiahao Wu, Yuan Xu, Zezheng Zhu, Yunxiang Zhang, Kenneth Chen, Yanqi Liang, Nanfang Yu, Qi Sun	2025-03-20	Download	Depth foundation models offer strong learned priors for 3D perception but lack physical depth cues, leading to ambiguities in metric scale. We introduce a birefringent metalens -- a planar nanophotoni...
CATCH: a Cost Analysis Tool for Co-optimization of chiplet-based Heterogeneous systems	Alexander Graening, Jonti Talukdar, Saptadeep Pal, Krishnendu Chakrabarty, Puneet Gupta	2025-03-20	Download	With the increasing prevalence of chiplet systems in high-performance computing applications, the number of design options has increased dramatically.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
HiAER-Spike: Hardware-Software Co-Design for Large-Scale Reconfigurable Event-Driven Neuromorphic Computing	Gwenevere Frank, Gopabandhu Hota, Keli Wang, Abhinav Uppal, Omowuyi Olajide, Kenneth Yoshimoto, Leif Gibb, Qingbo Wang, Johannes Leugering, Stephen Deiss, Gert Cauwenberghs	2025-03-20	Download	In this work, we present HiAER-Spike, a modular, reconfigurable, event-driven neuromorphic computing platform designed to execute large spiking neural networks with up to 160 million neurons and 40 bi...
Random-sketching Techniques to Enhance the Numerical Stability of Block Orthogonalization Algorithms for s-step GMRES	Ichitaro Yamazaki, Andrew J. Higgins, Erik G. Boman, Daniel B. Szyld	2025-03-20	Download	We integrate random sketching techniques into block orthogonalization schemes needed for s-step GMRES. The resulting block orthogonalization schemes generate the basis vectors whose overall orthogonal...
Flowshop Machine Scheduling: Markov Modeling, Optimal Schedules and Heuristics	Samah A. M. Ghanem	2025-03-20	Download	Flowshop machine scheduling has been of main interest in several applications where the timing of its processes plays a fundamental role in the utilization of system resources.
Graph of Effort: Quantifying Risk of AI Usage for Vulnerability Assessment	Anket Mehra, Andreas Aßmuth, Malte Prieß	2025-03-20	Download	With AI-based software becoming widely available, the risk of exploiting its capabilities, such as high automation and complex pattern recognition, could significantly increase.
A parallel algorithm for the odd two-face shortest k-disjoint path problem	Srijan Chakraborty, Samir Datta	2025-03-20	Download	The shortest Disjoint Path problem (SDPP) requires us to find pairwise vertex disjoint paths between k designated pairs of terminal vertices such that the sum of the path lengths is minimum.
RESFL: An Uncertainty-Aware Framework for Responsible Federated Learning by Balancing Privacy, Fairness and Utility in Autonomous Vehicles	Dawood Wasif, Terrence J. Moore, Jin-Hee Cho	2025-03-20	Download	Autonomous vehicles (AVs) increasingly rely on Federated Learning (FL) to enhance perception models while preserving privacy. However, existing FL frameworks struggle to balance privacy, fairness, and...
Empirical Analysis of Privacy-Fairness-Accuracy Trade-offs in Federated Learning: A Step Towards Responsible AI	Dawood Wasif, Dian Chen, Sindhuja Madabushi, Nithin Alluru, Terrence J. Moore, Jin-Hee Cho	2025-03-20	Download	Federated Learning (FL) enables collaborative model training while preserving data privacy; however, balancing privacy preservation (PP) and fairness poses significant challenges.
Distributed LLMs and Multimodal Large Language Models: A Survey on Advances, Challenges, and Future Directions	Hadi Amini, Md Jueal Mia, Yasaman Saadati, Ahmed Imteaj, Seyedsina Nabavirazavi, Urmish Thakker, Md Zarif Hossain, Awal Ahmed Fime, S. S. Iyengar	2025-03-20	Download	Language models (LMs) are machine learning models designed to predict linguistic patterns by estimating the probability of word sequences based on large-scale datasets, such as text.
Dispersion is (Almost) Optimal under (A)synchrony	Ajay D. Kshemkalyani, Manish Kumar, Anisur Rahaman Molla, Gokarna Sharma	2025-03-20	Download	The dispersion problem has received much attention recently in the distributed computing literature. In this problem, $k\leq n$ agents placed initially arbitrarily on the nodes of an $n$-node, $m$-edg...
The Merit of Simple Policies: Buying Performance With Parallelism and System Architecture	Mert Yildiz, Alexey Rolich, Andrea Baiocchi	2025-03-20	Download	While scheduling and dispatching of computational workloads is a well-investigated subject, only recently has Google provided publicly a vast high-resolution measurement dataset of its cloud workloads...
iDynamics: A Configurable Emulation Framework for Evaluating Microservice Scheduling Policies under Controllable Cloud-Edge Dynamics	Ming Chen, Muhammed Tawfiqul Islam, Maria Rodriguez Read, Rajkumar Buyya	2025-03-20	Download	This paper presents iDynamics, a configurable emulation framework that exposes these dynamics as controllable experimental factors while running real microservice code on a Kubernetes-based cloud-edge...
On the Effectiveness of the 'Follow-the-Sun' Strategy in Mitigating the Carbon Footprint of AI in Cloud Instances	Roberto Vergallo, Luís Cruz, Alessio Errico, Luca Mainetti	2025-03-20	Download	'Follow-the-Sun' (FtS) is a theoretical computational model aimed at minimizing the carbon footprint of computer workloads. It involves dynamically moving workloads to regions with cleaner energy sour...
SPIN: Accelerating Large Language Model Inference with Heterogeneous Speculative Models	Fahao Chen, Peng Li, Tom H. Luan, Zhou Su, Jing Deng	2025-03-20	Download	Speculative decoding has been shown as an effective way to accelerate Large Language Model (LLM) inference by using a Small Speculative Model (SSM) to generate candidate tokens in a so-called speculat...
Prediction of Permissioned Blockchain Performance for Resource Scaling Configurations	Seungwoo Jung, Yeonho Yoo, Gyeongsik Yang, Chuck Yoo	2025-03-20	Download	Blockchain is increasingly offered as blockchain-as-a-service (BaaS) by cloud service providers. However, configuring BaaS appropriately for optimal performance and reliability resorts to try-and-erro...
ATTENTION2D: Communication Efficient Distributed Self-Attention Mechanism	Venmugil Elango	2025-03-20	Download	Transformer-based models have emerged as a leading architecture for natural language processing, natural language generation, and image generation tasks.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Ordered Topological Deep Learning: a Network Modeling Case Study	Guillermo Bernárdez, Miquel Ferriol-Galmés, Carlos Güemes-Palau, Mathilde Papillon, Pere Barlet-Ros, Albert Cabellos-Aparicio, Nina Miolane	2025-03-20	Download	Computer networks are the foundation of modern digital infrastructure, facilitating global communication and data exchange. As demand for reliable high-bandwidth connectivity grows, advanced network m...
Comparative Analysis of Deep Learning Models for Real-World ISP Network Traffic Forecasting	Josef Koumar, Timotej Smoleň, Kamil Jeřábek, Tomáš Čejka	2025-03-20	Download	Accurate network traffic forecasting is essential for Internet Service Providers (ISP) to optimize resources, enhance user experience, and mitigate anomalies.
Distributed Split Computing Using Diffusive Metrics for UAV Swarms	Talip Tolga Sarı, Gökhan Seçinti, Angelo Trotta	2025-03-20	Download	In large-scale UAV swarms, dynamically executing machine learning tasks can pose significant challenges due to network volatility and the heterogeneous resource constraints of each UAV.
PromptMobile: Efficient Promptus for Low Bandwidth Mobile Video Streaming	Liming Liu, Jiangkai Wu, Haoyang Wang, Peiheng Wang, Zongming Guo, Xinggong Zhang	2025-03-20	Download	Traditional video compression algorithms exhibit significant quality degradation at extremely low bitrates. Promptus emerges as a new paradigm for video streaming, substantially cutting down the bandw...
Sustainable Open-Data Management for Field Research: A Cloud-Based Approach in the Underlandscape Project	Augusto Ciuffoletti, Letizia Chiti	2025-03-20	Download	Field-based research projects require a robust suite of ICT services to support data acquisition, documentation, storage, and dissemination. A key challenge lies in ensuring the sustainability of data...
Energy-Efficient Federated Learning and Migration in Digital Twin Edge Networks	Yuzhi Zhou, Yaru Fu, Zheng Shi, Howard H. Yang, Kevin Hung, Yan Zhang	2025-03-20	Download	The digital twin edge network (DITEN) is a significant paradigm in the sixth-generation wireless system (6G) that aims to organize well-developed infrastructures to meet the requirements of evolving a...
Enhancing Physical Layer Security in Cognitive Radio-Enabled NTNs with Beyond Diagonal RIS	Wali Ullah Khan, Chandan Kumar Sheemar, Eva Lagunas, Symeon Chatzinotas	2025-03-20	Download	Beyond diagonal reconfigurable intelligent surfaces (BD-RIS) have emerged as a transformative technology for enhancing wireless communication by intelligently manipulating the propagation environment.
Towards Agentic AI Networking in 6G: A Generative Foundation Model-as-Agent Approach	Yong Xiao, Guangming Shi, Ping Zhang	2025-03-20	Download	The promising potential of AI and network convergence in improving networking performance and enabling new service capabilities has recently attracted significant interest.

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Flowshop Machine Scheduling: Markov Modeling, Optimal Schedules and Heuristics	Samah A. M. Ghanem	2025-03-20	Download	Flowshop machine scheduling has been of main interest in several applications where the timing of its processes plays a fundamental role in the utilization of system resources.
A Dataset of Performance Measurements and Alerts from Mozilla (Data Artifact)	Mohamed Bilel Besbes, Diego Elias Costa, Suhaib Mujahid, Gregory Mierzwinski, Marco Castelluccio	2025-03-20	Download	Performance regressions in software systems can lead to significant financial losses and degraded user satisfaction, making their early detection and mitigation critical.