2025-04-02

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| PIMDAL: Mitigating the Memory Bottleneck in Data Analytics using a Real Processing-in-Memory System | Manos Frouzakis, Juan Gómez-Luna, Geraldo F. Oliveira, Mohammad Sadrosadati, Onur Mutlu | 2025-04-02 | Download | Database Management Systems (DBMSs) are crucial for efficient data management and analytics, and are used in several different application domains. |
| A flexible framework for early power and timing comparison of time-multiplexed CGRA kernel executions | Maxime Henri Aspros, Juan Sapriza, Giovanni Ansaloni, David Atienza | 2025-04-02 | Download | At the intersection between traditional CPU architectures and more specialized options such as FPGAs or ASICs lies the family of reconfigurable hardware architectures, termed Coarse-Grained Reconfigur... |
| Efficient Calibration for RRAM-based In-Memory Computing using DoRA | Weirong Dong, Kai Zhou, Zhen Kong, Quan Cheng, Junkai Huang, Zhengke Yang, Masanori Hashimoto, Longyang Lin | 2025-04-02 | Download | Resistive In-Memory Computing (RIMC) offers ultra-efficient computation for edge AI but faces accuracy degradation due to RRAM conductance drift over time. |
| MERE: Hardware-Software Co-Design for Masking Cache Miss Latency in Embedded Processors | Dean You, Jieyu Jiang, Xiaoxuan Wang, Yushu Du, Zhihang Tan, Wenbo Xu, Hui Wang, Jiapeng Guan, Zhenyuan Wang, Ran Wei, Shuai Zhao, Zhe Jiang | 2025-04-02 | Download | Runahead execution is a technique to mask memory latency caused by irregular memory accesses. By pre-executing the application code during occurrences of long-latency operations and prefetching antici... |
| HH-PIM: Dynamic Optimization of Power and Performance with Heterogeneous-Hybrid PIM for Edge AI Devices | Sangmin Jeon, Kangju Lee, Kyeongwon Lee, Woojoo Lee | 2025-04-02 | Download | Processing-in-Memory (PIM) architectures offer promising solutions for efficiently handling AI applications in energy-constrained edge environments. |
| Versatile silicon integrated photonic processor: a reconfigurable solution for next-generation AI clusters | Ying Zhu, Yifan Liu, Xinyu Yang, Kailai Liu, Xin Hua, Ming Luo, Jia Liu, Siyao Chang, Shengxiang Zhang, Miao Wu, Zhicheng Wang, Hongguang Zhang, Daigao Chen, Xi Xiao, Shaohua Yu | 2025-04-02 | Download | The Artificial Intelligence models pose serious challenges in intensive computing and high-bandwidth communication for conventional electronic circuit-based computing clusters. |
| FireGuard: A Generalized Microarchitecture for Fine-Grained Security Analysis on OoO Superscalar Cores | Zhe Jiang, Sam Ainsworth, Timothy Jones | 2025-04-02 | Download | High-performance security guarantees rely on hardware support. Generic programmable support for fine-grained instruction analysis has gained broad interest in the literature as a fundamental building ... |
| MEEK: Re-thinking Heterogeneous Parallel Error Detection Architecture for Real-World OoO Superscalar Processors | Zhe Jiang, Minli Liao, Sam Ainsworth, Dean You, Timothy Jones | 2025-04-02 | Download | Heterogeneous parallel error detection is an approach to achieving fault-tolerant processors, leveraging multiple power-efficient cores to re-execute software originally run on a high-performance core... |
| GigaAPI for GPU Parallelization | M. Suvarna, O. Tehrani | 2025-04-02 | Download | GigaAPI is a user-space API that simplifies multi-GPU programming, bridging the gap between the capabilities of parallel GPU systems and the ability of developers to harness their full potential. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| PIMDAL: Mitigating the Memory Bottleneck in Data Analytics using a Real Processing-in-Memory System | Manos Frouzakis, Juan Gómez-Luna, Geraldo F. Oliveira, Mohammad Sadrosadati, Onur Mutlu | 2025-04-02 | Download | Database Management Systems (DBMSs) are crucial for efficient data management and analytics, and are used in several different application domains. |
| Efficient Federated Learning Tiny Language Models for Mobile Network Feature Prediction | Daniel Becking, Ingo Friese, Karsten Müller, Thomas Buchholz, Mandy Galkow-Schneider, Wojciech Samek, Detlev Marpe | 2025-04-02 | Download | In telecommunications, Autonomous Networks (ANs) automatically adjust configurations based on specific requirements (e.g., bandwidth) and available resources. |
| Improved Bounds for Coin Flipping, Leader Election, and Random Selection | Eshan Chattopadhyay, Mohit Gurumukhani, Noam Ringach, Rocco Servedio | 2025-04-02 | Download | Random selection, leader election, and collective coin flipping are fundamental tasks in fault-tolerant distributed computing. We study these problems in the full-information model where despite decad... |
| Distributed Triangle Detection is Hard in Few Rounds | Sepehr Assadi, Janani Sundaresan | 2025-04-02 | Download | In the distributed triangle detection problem, we have an $n$-vertex network $G=(V,E)$ with one player for each vertex of the graph who sees the edges incident on the vertex. |
| Shared-Memory Hierarchical Process Mapping | Christian Schulz, Henning Woydt | 2025-04-02 | Download | Modern large-scale scientific applications consist of thousands to millions of individual tasks. These tasks involve not only computation but also communication with one another. |
| Satellite Edge Artificial Intelligence with Large Models: Architectures and Technologies | Yuanming Shi, Jingyang Zhu, Chunxiao Jiang, Linling Kuang, Khaled B. Letaief | 2025-04-02 | Download | Driven by the growing demand for intelligent remote sensing applications, large artificial intelligence (AI) models pre-trained on large-scale unlabeled datasets and fine-tuned for downstream tasks ha... |
| Approximate Agreement Algorithms for Byzantine Collaborative Learning | Mélanie Cambus, Darya Melnyk, Tijana Milentijević, Stefan Schmid | 2025-04-02 | Download | In Byzantine collaborative learning, $n$ clients in a peer-to-peer network collectively learn a model without sharing their data by exchanging and aggregating stochastic gradient estimates. |
| Split Federated Learning for Low-Altitude Wireless Networks: Joint Sensing, Communication, Computation, and Control Co-design | Xiangwang Hou, Xianghe Wang, Jiacheng Wang, Zekai Zhang, Jun Du, Jingjing Wang, Yong Ren | 2025-04-02 | Download | Unmanned aerial vehicles (UAVs) with integrated sensing, communication, computation and control (ISC3) capabilities have become key enablers of next-generation wireless networks. |
| Exploiting the Uncertainty of the Longest Paths: Response Time Analysis for Probabilistic DAG Tasks | Yiyang Gao, Shuai Zhao, Boyang Li, Xinwei Fang, Zhiyang Lin, Zhe Jiang, Nan Guan | 2025-04-02 | Download | Parallel real-time systems (e.g., autonomous driving systems) often contain functionalities with complex dependencies and execution uncertainties, leading to significant timing variability which can b... |
| Accelerating Blockchain Scalability: New Models for Parallel Transaction Execution in the EVM | Souradeep Das, Konpat Preechakul, Jonas Bäumer, Riddhi Patel, Jefferson Jinchuan Li | 2025-04-02 | Download | As the number of decentralized applications and users on Ethereum grows, the ability of the blockchain to efficiently handle a growing number of transactions becomes increasingly strained. |
| Age-Aware Partial Gradient Update Strategy for Federated Learning Over the Air | Ruihao Du, Jiaqi Zhu, Zeshen Li, Howard H. Yang | 2025-04-02 | Download | Frequent parameter exchanges between clients and the edge server incur substantial communication overhead, posing a critical bottleneck in federated learning (FL). |
| Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert Parallelism Design | Mohan Zhang, Pingzhi Li, Jie Peng, Mufan Qiu, Tianlong Chen | 2025-04-02 | Download | Mixture-of-Experts (MoE) has successfully scaled up models while maintaining nearly constant computing costs. By employing a gating network to route input tokens, it selectively activates a subset of ... |
| GigaAPI for GPU Parallelization | M. Suvarna, O. Tehrani | 2025-04-02 | Download | GigaAPI is a user-space API that simplifies multi-GPU programming, bridging the gap between the capabilities of parallel GPU systems and the ability of developers to harness their full potential. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| FastFlow: Early Yet Robust Network Flow Classification using the Minimal Number of Time-Series Packets | Rushi Jayeshkumar Babaria, Minzhao Lyu, Gustavo Batista, Vijay Sivaraman | 2025-04-02 | Download | Network traffic classification is of great importance for network operators in their daily routines, such as analyzing the usage patterns of multimedia applications and optimizing network configuratio... |
| Toward a Sustainable Low-Altitude Economy: A Survey of Energy-Efficient RIS-UAV Networks | Manzoor Ahmed, Aized Amin Soofi, Feroz Khan, Salman Raza, Wali Ullah Khan, Lina Su, Fang Xu, Zhu Han | 2025-04-02 | Download | The integration of RIS into UAV networks presents a transformative solution for achieving energy-efficient and reliable communication, particularly within the rapidly expanding low-altitude economy (L... |
| Asynchronous Traffic Shaping and Redundancy: Avoiding Unbounded Latencies in In-Car Networks | Teresa Lübeck, Philipp Meyer, Timo Häckel, Franz Korf, Thomas C. Schmidt | 2025-04-02 | Download | Time-Sensitive Networking enhances Ethernet-based In-Vehicle Networks (IVNs) with real-time capabilities. Different traffic shaping algorithms have been proposed for time-critical communication, of wh... |
| Strengthening Multi-Robot Systems for SAR: Co-Designing Robotics and Communication Towards 6G | Juan Bravo-Arrabal, Ricardo Vázquez-Martín, J. J. Fernández-Lozano, Alfonso García-Cerezo | 2025-04-02 | Download | This paper presents field-tested use cases from Search and Rescue (SAR) missions, highlighting the co-design of mobile robots and communication systems to support Edge-Cloud architectures based on 5G ... |
| A Deep Incremental Framework for Multi-Service Multi-Modal Devices in NextG AI-RAN Systems | Mrityunjoy Gain, Kitae Kim, Avi Deb Raha, Apurba Adhikary, Walid Saad, Zhu Han, Choong Seon Hong | 2025-04-02 | Download | In this paper, we propose a deep incremental framework for efficient RAN management, introducing the Multi-Service-Modal UE (MSMU) system, which enables a single UE to handle eMBB and uRLLC services s... |
| Satellite Edge Artificial Intelligence with Large Models: Architectures and Technologies | Yuanming Shi, Jingyang Zhu, Chunxiao Jiang, Linling Kuang, Khaled B. Letaief | 2025-04-02 | Download | Driven by the growing demand for intelligent remote sensing applications, large artificial intelligence (AI) models pre-trained on large-scale unlabeled datasets and fine-tuned for downstream tasks ha... |
| Optimization of BLE Broadcast Mode in Offline Finding Network | L Zhang, C Feng, T Xia | 2025-04-02 | Download | In the Offline Finding Network (OFN), offline Bluetooth tags broadcast to the surrounding area, and finder devices receive the broadcast signal and upload location information to the IoT (Internet of ... |
| Balancing Subjectivity and Objectivity in Network Selection: A Decision-Making Framework Towards Digital Twins | Brahim Mefgouda, Hanen Idoudi, Mohammad Al-Quraan, Ismail Lotfi, Omar Alhussein, Lina Mohjazi, Sami Muhaidat | 2025-04-02 | Download | Selecting the optimal radio access technology (RAT) during vertical handovers (VHO) in heterogeneous wireless networks (HWNs) is critical. Multi-attribute decision-making (MADM) is the most common app... |
| The Multifractal IP Address Structure: Physical Explanation and Implications | Chris Misa, Ram Durairajan, Arpit Gupta, Reza Rejaie, Walter Willinger | 2025-04-02 | Download | The structure of IP addresses observed in Internet traffic plays a critical role for a wide range of networking problems of current interest. For example, modern network telemetry systems that take ad... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Enhancing Traffic Sign Recognition On The Performance Based On Yolov8 | Baba Ibrahim, Zhou Kui | 2025-04-02 | Download | Traffic sign recognition plays a crucial role in the development of autonomous vehicles and advanced driver-assistance systems (ADAS). |
| GigaAPI for GPU Parallelization | M. Suvarna, O. Tehrani | 2025-04-02 | Download | GigaAPI is a user-space API that simplifies multi-GPU programming, bridging the gap between the capabilities of parallel GPU systems and the ability of developers to harness their full potential. |
