
2025-02-24

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Optimized Memory System Architecture for VESA VDC-M Decoder with Multi-Slice Support | Hannah Yang, Sohyeon Kim, Saeyeon Kim, Jiyoung Lee, Huijin Roh, Ji-Hoon Kim | 2025-02-24 | Download | Video compression plays a pivotal role in managing and transmitting large-scale display data, particularly given the growing demand for higher resolutions and improved video quality. |
| LLM Inference Acceleration via Efficient Operation Fusion | Mahsa Salmani, Ilya Soloveychik | 2025-02-24 | Download | The rapid development of the Transformer-based Large Language Models (LLMs) in recent years has been closely linked to their ever-growing and already enormous sizes. |
| THOR: A Non-Speculative Value Dependent Timing Side Channel Attack Exploiting Intel AMX | Farshad Dizani, Azam Ghanbari, Joshua Kalyanapu, Darsh Asher, Samira Mirbagher Ajorpaz | 2025-02-24 | Download | The rise of on-chip accelerators signifies a major shift in computing, driven by the growing demands of artificial intelligence (AI) and specialized applications. |
| Evaluating IOMMU-Based Shared Virtual Addressing for RISC-V Embedded Heterogeneous SoCs | Cyril Koenig, Enrico Zelioli, Luca Benini | 2025-02-24 | Download | Embedded heterogeneous systems-on-chip (SoCs) rely on domain-specific hardware accelerators to improve performance and energy efficiency. In particular, programmable multi-core accelerators feature a ... |
| Analyzing a Two-Tier Disaggregated Memory Protection Scheme Based on Memory Replication | Haris Volos, Yiannakis Sazeides | 2025-02-24 | Download | As memory technologies continue to shrink and memory error rates increase, the demand for stronger reliability becomes increasingly critical. Fine-grain memory replication has emerged as an appealing ... |
| VR-Pipe: Streamlining Hardware Graphics Pipeline for Volume Rendering | Junseo Lee, Jaisung Kim, Junyong Park, Jaewoong Sim | 2025-02-24 | Download | Graphics rendering that builds on machine learning and radiance fields is gaining significant attention due to its outstanding quality and speed in generating photorealistic images from novel viewpoin... |
| Be CIM or Be Memory: A Dual-mode-aware DNN Compiler for CIM Accelerators | Shixin Zhao, Yuming Li, Bing Li, Yintao He, Mengdi Wang, Yinhe Han, Ying Wang | 2025-02-24 | Download | Computing-in-memory (CIM) architectures demonstrate superior performance over traditional architectures. To unleash the potential of CIM accelerators, many compilation methods have been proposed, focu... |
| Make LLM Inference Affordable to Everyone: Augmenting GPU Memory with NDP-DIMM | Lian Liu, Shixin Zhao, Bing Li, Haimeng Ren, Zhaohui Xu, Mengdi Wang, Xiaowei Li, Yinhe Han, Ying Wang | 2025-02-24 | Download | The billion-scale Large Language Models (LLMs) need deployment on expensive server-grade GPUs with large-storage HBMs and abundant computation capability. |
| APINT: A Full-Stack Framework for Acceleration of Privacy-Preserving Inference of Transformers based on Garbled Circuits | Hyunjun Cho, Jaeho Jeon, Jaehoon Heo, Joo-Young Kim | 2025-02-24 | Download | As the importance of Privacy-Preserving Inference of Transformers (PiT) increases, a hybrid protocol that integrates Garbled Circuits (GC) and Homomorphic Encryption (HE) is emerging for its implement... |
| A Review of Memory Wall for Neuromorphic Computing | Dexter Le, Baran Arig, Murat Isik, I. Can Dikmen, Teoman Karadag | 2025-02-24 | Download | This paper reviews memory technologies used in Field-Programmable Gate Arrays (FPGAs) for neuromorphic computing, a brain-inspired approach transforming artificial intelligence with improved efficienc... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Robust Federated Learning with Global Sensitivity Estimation for Financial Risk Management | Lei Zhao, Lin Cai, Wu-Sheng Lu | 2025-02-24 | Download | In decentralized financial systems, robust and efficient Federated Learning (FL) is promising to handle diverse client environments and ensure resilience to systemic risks. |
| Provable Model-Parallel Distributed Principal Component Analysis with Parallel Deflation | Fangshuo Liao, Wenyi Su, Anastasios Kyrillidis | 2025-02-24 | Download | We study a distributed Principal Component Analysis (PCA) framework where each worker targets a distinct eigenvector and refines its solution by updating from intermediate solutions provided by peers ... |
| Order Fairness Evaluation of DAG-based ledgers | Erwan Mahe, Sara Tucci-Piergiovanni | 2025-02-24 | Download | Order fairness in distributed ledgers refers to properties that relate the order in which transactions are sent or received to the order in which they are eventually finalized, i.e., totally ordered. |
| Robust Federated Learning in Unreliable Wireless Networks: A Client Selection Approach | Yanmeng Wang, Wenkai Ji, Jian Zhou, Fu Xiao, Tsung-Hui Chang | 2025-02-24 | Download | Federated learning (FL) has emerged as a promising distributed learning paradigm for training deep neural networks (DNNs) at the wireless edge, but its performance can be severely hindered by unreliab... |
| Analyzing a Two-Tier Disaggregated Memory Protection Scheme Based on Memory Replication | Haris Volos, Yiannakis Sazeides | 2025-02-24 | Download | As memory technologies continue to shrink and memory error rates increase, the demand for stronger reliability becomes increasingly critical. Fine-grain memory replication has emerged as an appealing ... |
| Revisited Convergence of Dolev et al BFS Spanning Tree Algorithm | Karine Altisen, Marius Bozga | 2025-02-24 | Download | We provide a constructive proof for the convergence of Dolev et al's BFS spanning tree algorithm running under the general assumption of an unfair daemon. |
| Towards Enterprise-Ready Computer Using Generalist Agent | Sami Marreed, Alon Oved, Avi Yaeli, Segev Shlomov, Ido Levy, Offer Akrabi, Aviad Sela, Asaf Adi, Nir Mashkif | 2025-02-24 | Download | This paper presents our ongoing work toward developing an enterprise-ready Computer Using Generalist Agent (CUGA) system. Our research highlights the evolutionary nature of building agentic systems su... |
| Can Tensor Cores Benefit Memory-Bound Kernels? (No!) | Lingqi Zhang, Jiajun Huang, Sheng Di, Satoshi Matsuoka, Mohamed Wahib | 2025-02-24 | Download | Tensor cores are specialized processing units within GPUs that have demonstrated significant efficiency gains in compute-bound applications such as Deep Learning Training by accelerating dense matrix ... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Goal-Oriented Middleware Filtering at Transport Layer Based on Value of Updates | Polina Kutsevol, Onur Ayan, Nikolaos Pappas, Wolfgang Kellerer | 2025-02-24 | Download | This work explores employing the concept of goal-oriented (GO) semantic communication for real-time monitoring and control. Generally, GO communication advocates for the deep integration of applicatio... |
| +Tour: Recommending personalized itineraries for smart tourism | João Paulo Esper, Luciano de S. Fraga, Aline C. Viana, Kleber Vieira Cardoso, Sand Luz Correa | 2025-02-24 | Download | Next-generation touristic services will rely on the advanced mobile networks' high bandwidth and low latency and the Multi-access Edge Computing (MEC) paradigm to provide fully immersive mobile experi... |
| Qoala: an Application Execution Environment for Quantum Internet Nodes | Bart van der Vecht, Atak Talay Yücel, Hana Jirovská, Stephanie Wehner | 2025-02-24 | Download | Recently, a first-of-its-kind operating system for programmable quantum network nodes was developed, called QNodeOS. Here, we present an extension of QNodeOS called Qoala, which introduces (1) a unifi... |
| A Novel Multiple Access Scheme for Heterogeneous Wireless Communications using Symmetry-aware Continual Deep Reinforcement Learning | Hamidreza Mazandarani, Masoud Shokrnezhad, Tarik Taleb | 2025-02-24 | Download | The Metaverse holds the potential to revolutionize digital interactions through the establishment of a highly dynamic and immersive virtual realm over wireless communications systems, offering service... |
| Semantic-Aware Dynamic and Distributed Power Allocation: a Multi-UAV Area Coverage Use Case | Hamidreza Mazandarani, Masoud Shokrnezhad, Tarik Taleb | 2025-02-24 | Download | The advancement towards 6G technology leverages improvements in aerial-terrestrial networking, where one of the critical challenges is the efficient allocation of transmit power. |
| Toward Agentic AI: Generative Information Retrieval Inspired Intelligent Communications and Networking | Ruichen Zhang, Shunpu Tang, Yinqiu Liu, Dusit Niyato, Zehui Xiong, Sumei Sun, Shiwen Mao, Zhu Han | 2025-02-24 | Download | The increasing complexity and scale of modern telecommunications networks demand intelligent automation to enhance efficiency, adaptability, and resilience. |

cs.OS - Operating Systems

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Qoala: an Application Execution Environment for Quantum Internet Nodes | Bart van der Vecht, Atak Talay Yücel, Hana Jirovská, Stephanie Wehner | 2025-02-24 | Download | Recently, a first-of-its-kind operating system for programmable quantum network nodes was developed, called QNodeOS. Here, we present an extension of QNodeOS called Qoala, which introduces (1) a unifi... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Requirements for Quality Assurance of AI Models for Early Detection of Lung Cancer | Horst K. Hahn, Matthias S. May, Volker Dicken, Michael Walz, Rainer Eßeling, Bianca Lassen-Schmidt, Robert Rischen, Jens Vogel-Claussen, Konstantin Nikolaou, Jörg Barkhausen | 2025-02-24 | Download | Lung cancer is the second most common cancer and the leading cause of cancer-related deaths worldwide. Survival largely depends on tumor stage at diagnosis, and early detection with low-dose CT can si... |
| Can Tensor Cores Benefit Memory-Bound Kernels? (No!) | Lingqi Zhang, Jiajun Huang, Sheng Di, Satoshi Matsuoka, Mohamed Wahib | 2025-02-24 | Download | Tensor cores are specialized processing units within GPUs that have demonstrated significant efficiency gains in compute-bound applications such as Deep Learning Training by accelerating dense matrix ... |
