
2025-12-11

cs.AR - Architecture

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| HLS4PC: A Parametrizable Framework For Accelerating Point-Based 3D Point Cloud Models on FPGA | Amur Saqib Pal, Muhammad Mohsin Ghaffar, Faisal Shafait, Christian Weis, Norbert Wehn | 2025-12-11 | Point-based 3D point cloud models employ computation and memory intensive mapping functions alongside NN layers for classification/segmentation, and are executed on server-grade GPUs. |
| Design Space Exploration of DMA based Finer-Grain Compute Communication Overlap | Shagnik Pal, Shaizeen Aga, Suchita Pati, Mahzabeen Islam, Lizy K. John | 2025-12-11 | As both ML training and inference are increasingly distributed, parallelization techniques that shard (divide) ML model across GPUs of a distributed system, are often deployed. |
| SemanticBBV: A Semantic Signature for Cross-Program Knowledge Reuse in Microarchitecture Simulation | Zhenguo Liu, Chengao Shi, Chen Ding, Jiang Xu | 2025-12-11 | For decades, sampling-based techniques have been the de facto standard for accelerating microarchitecture simulation, with the Basic Block Vector (BBV) serving as the cornerstone program representatio... |
| Neuromorphic Processor Employing FPGA Technology with Universal Interconnections | Pracheta Harlikar, Abdel-Hameed A. Badawy, Prasanna Date | 2025-12-11 | Neuromorphic computing, inspired by biological neural systems, holds immense promise for ultra-low-power and real-time inference applications. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| An LLVM-Based Optimization Pipeline for SPDZ | Tianye Dai, Hammurabi Mendes, Heuichan Lim | 2025-12-11 | Actively secure arithmetic MPC is now practical for real applications, but performance and usability are still limited by framework-specific compilation stacks, the need for programmers to explicitly ... |
| HLS4PC: A Parametrizable Framework For Accelerating Point-Based 3D Point Cloud Models on FPGA | Amur Saqib Pal, Muhammad Mohsin Ghaffar, Faisal Shafait, Christian Weis, Norbert Wehn | 2025-12-11 | Point-based 3D point cloud models employ computation and memory intensive mapping functions alongside NN layers for classification/segmentation, and are executed on server-grade GPUs. |
| TriHaRd: Higher Resilience for TEE Trusted Time | Matthieu Bettinger, Sonia Ben Mokhtar, Pascal Felber, Etienne Rivière, Valerio Schiavoni, Anthony Simonet-Boulogne | 2025-12-11 | Accurately measuring time passing is critical for many applications. However, in Trusted Execution Environments (TEEs) such as Intel SGX, the time source is outside the Trusted Computing Base: a malic... |
| A Proof of Success and Reward Distribution Protocol for Multi-bridge Architecture in Cross-chain Communication | Damilare Peter Oyinloye, Mohd Sameen Chishti, Jingyue Li | 2025-12-11 | Single-bridge blockchain solutions enable cross-chain communication. However, they are associated with centralization and single-point-of-failure risks. |
| ESS: An Offload-Centric Latent-Cache Management Architecture for DeepSeek-V3.2-Exp | Xinhang Chen, Chao Zhang, Jiahuan He, Wei Liu, Jianming Zhang, Wenlong Zhou, Xiao Li, Pai Zeng, Shiyong Li, Yuanpan Qian, Dong Li, Zhaogeng Li | 2025-12-11 | DeepSeek-V3.2-Exp introduces a sparse attention mechanism that significantly reduces inference latency in long-context scenarios. Although the overall throughput has improved greatly, the Decode-stage... |
| Clustered Federated Learning with Hierarchical Knowledge Distillation | Sabtain Ahmad, Meerzhan Kanatbekova, Ivona Brandic, Atakan Aral | 2025-12-11 | Clustered Federated Learning (CFL) has emerged as a powerful approach for addressing data heterogeneity and ensuring privacy in large distributed IoT environments. |
| Differential Privacy for Secure Machine Learning in Healthcare IoT-Cloud Systems | N Mangala, Murtaza Rangwala, S Aishwarya, B Eswara Reddy, Rajkumar Buyya, KR Venugopal, SS Iyengar, LM Patnaik | 2025-12-11 | Healthcare has become exceptionally sophisticated, as wearables and connected medical devices revolutionize remote patient monitoring, emergency response, medication management, diagnosis, and predict... |
| Making Wide Stripes Practical: Cascaded Parity LRCs for Efficient Repair and High Reliability | Fan Yu, Guodong Li, Si Wu, Weijun Fang, Sihuang Hu | 2025-12-11 | Erasure coding with wide stripes is increasingly adopted to reduce storage overhead in large-scale storage systems. However, existing Locally Repairable Codes (LRCs) exhibit structural limitations in ... |
| HybridFlow: Resource-Adaptive Subtask Routing for Efficient Edge-Cloud LLM Inference | Jiangwen Dong, Jiayu Li, Tianhang Zheng, Wanyu Lin | 2025-12-11 | Edge-cloud collaborative inference is becoming a practical necessity for LLM-powered edge devices: on-device models often cannot afford the required reasoning capability, while cloud-only inference co... |
| D2M: A Decentralized, Privacy-Preserving, Incentive-Compatible Data Marketplace for Collaborative Learning | Yash Srivastava, Shalin Jain, Sneha Awathare, Nitin Awathare | 2025-12-11 | The rising demand for collaborative machine learning and data analytics calls for secure and decentralized data sharing frameworks that balance privacy, trust, and incentives. |
| Bit of a Close Talker: A Practical Guide to Serverless Cloud Co-Location Attacks | Wei Shao, Najmeh Nazari, Behnam Omidi, Setareh Rafatirad, Khaled N. Khasawneh, Houman Homayoun, Chongzhou Fang | 2025-12-11 | Serverless computing has revolutionized cloud computing by offering users an efficient, cost-effective way to develop and deploy applications without managing infrastructure details. |
| High-Dimensional Data Processing: Benchmarking Machine Learning and Deep Learning Architectures in Local and Distributed Environments | Julian Rodriguez, Piotr Lopez, Emiliano Lerma, Rafael Medrano, Jacobo Hernandez | 2025-12-11 | This document reports the sequence of practices and methodologies implemented during the Big Data course. It details the workflow beginning with the processing of the Epsilon dataset through group and... |
| Hybrid Learning and Optimization-Based Dynamic Scheduling for DL Workloads on Heterogeneous GPU Clusters | Shruti Dongare, Redwan Ibne Seraj Khan, Hadeel Albahar, Nannan Zhao, Diego Melendez Maita, Ali R. Butt | 2025-12-11 | Modern cloud platforms increasingly host large-scale deep learning (DL) workloads, demanding high-throughput, low-latency GPU scheduling. However, the growing heterogeneity of GPU clusters and limited... |
| SlimEdge: Performance and Device Aware Distributed DNN Deployment on Resource-Constrained Edge Hardware | Mahadev Sunil Kumar, Arnab Raha, Debayan Das, Gopakumar G, Rounak Chatterjee, Amitava Mukherjee | 2025-12-11 | Distributed deep neural networks (DNNs) have become central to modern computer vision, yet their deployment on resource-constrained edge devices remains hindered by substantial parameter counts, compu... |
| Design Space Exploration of DMA based Finer-Grain Compute Communication Overlap | Shagnik Pal, Shaizeen Aga, Suchita Pati, Mahzabeen Islam, Lizy K. John | 2025-12-11 | As both ML training and inference are increasingly distributed, parallelization techniques that shard (divide) ML model across GPUs of a distributed system, are often deployed. |
| SoDA: An Efficient Interaction Paradigm for the Agentic Web | Zicai Cui, Zhouyuan Jian, Weiwen Liu, Weinan Zhang | 2025-12-11 | As the internet evolves from the mobile App-dominated Attention Economy to the Intent-Interconnection of the Agentic Web era, existing interaction modes fail to address the escalating challenges of da... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| BIER-Star: Stateless Geographic Multicast for Scalable Satellite-Terrestrial Integration | Mostafa Abdollahi, Wenjun Yang, Jianping Pan | 2025-12-11 | The rapid expansion of LEO satellite constellations has enabled an integrated terrestrial network and non-terrestrial network (TN-NTN), connecting diverse users such as aircraft, ships, and remote com... |
| SHIFT: Exploring the Boundary of RDMA Network Fault Tolerance | Shengkai Lin, Kairui Zhou, Hongtao Zhang, Yibo Wu, Yi Pan, Yihan Yang, Qinwei Yang, Wei Zhang, Arvind Krishnamurthy, Shizhen Zhao | 2025-12-11 | Under gang scheduling for large-scale distributed large language model (LLM) training, a single network anomaly can stall or abort an entire job. |
| A Differentiable Digital Twin of Distributed Link Scheduling for Contention-Aware Networking | Zhongyuan Zhao, Yujun Ming, Kevin Chan, Ananthram Swami, Santiago Segarra | 2025-12-11 | Many routing and flow optimization problems in wired networks can be solved efficiently using minimum cost flow formulations. However, this approach does not extend to wireless multi-hop networks, whe... |
| Natural Language Interface for Firewall Configuration | F. Taghiyev, A. Aslanbayli | 2025-12-11 | This paper presents the design and prototype implementation of a natural language interface for configuring enterprise firewalls. The framework allows administrators to express access control policies... |
| L2 Ethernet Switch VLSI Implementation | Aniruddh Mishra, Benjamin Oommen, Jimmy Liang | 2025-12-11 | Ethernet switches are foundational to the global internet infrastructure. These devices route packets of data on a local area network between source addresses to destination media access control addre... |
