2025-12-11

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
HLS4PC: A Parametrizable Framework For Accelerating Point-Based 3D Point Cloud Models on FPGA	Amur Saqib Pal, Muhammad Mohsin Ghaffar, Faisal Shafait, Christian Weis, Norbert Wehn	2025-12-11	Download	Point-based 3D point cloud models employ computation and memory intensive mapping functions alongside NN layers for classification/segmentation, and are executed on server-grade GPUs.
Design Space Exploration of DMA based Finer-Grain Compute Communication Overlap	Shagnik Pal, Shaizeen Aga, Suchita Pati, Mahzabeen Islam, Lizy K. John	2025-12-11	Download	As both ML training and inference are increasingly distributed, parallelization techniques that shard (divide) ML model across GPUs of a distributed system, are often deployed.
SemanticBBV: A Semantic Signature for Cross-Program Knowledge Reuse in Microarchitecture Simulation	Zhenguo Liu, Chengao Shi, Chen Ding, Jiang Xu	2025-12-11	Download	For decades, sampling-based techniques have been the de facto standard for accelerating microarchitecture simulation, with the Basic Block Vector (BBV) serving as the cornerstone program representatio...
Neuromorphic Processor Employing FPGA Technology with Universal Interconnections	Pracheta Harlikar, Abdel-Hameed A. Badawy, Prasanna Date	2025-12-11	Download	Neuromorphic computing, inspired by biological neural systems, holds immense promise for ultra-low-power and real-time inference applications.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
An LLVM-Based Optimization Pipeline for SPDZ	Tianye Dai, Hammurabi Mendes, Heuichan Lim	2025-12-11	Download	Actively secure arithmetic MPC is now practical for real applications, but performance and usability are still limited by framework-specific compilation stacks, the need for programmers to explicitly ...
HLS4PC: A Parametrizable Framework For Accelerating Point-Based 3D Point Cloud Models on FPGA	Amur Saqib Pal, Muhammad Mohsin Ghaffar, Faisal Shafait, Christian Weis, Norbert Wehn	2025-12-11	Download	Point-based 3D point cloud models employ computation and memory intensive mapping functions alongside NN layers for classification/segmentation, and are executed on server-grade GPUs.
TriHaRd: Higher Resilience for TEE Trusted Time	Matthieu Bettinger, Sonia Ben Mokhtar, Pascal Felber, Etienne Rivière, Valerio Schiavoni, Anthony Simonet-Boulogne	2025-12-11	Download	Accurately measuring time passing is critical for many applications. However, in Trusted Execution Environments (TEEs) such as Intel SGX, the time source is outside the Trusted Computing Base: a malic...
A Proof of Success and Reward Distribution Protocol for Multi-bridge Architecture in Cross-chain Communication	Damilare Peter Oyinloye, Mohd Sameen Chishti, Jingyue Li	2025-12-11	Download	Single-bridge blockchain solutions enable cross-chain communication. However, they are associated with centralization and single-point-of-failure risks.
ESS: An Offload-Centric Latent-Cache Management Architecture for DeepSeek-V3.2-Exp	Xinhang Chen, Chao Zhang, Jiahuan He, Wei Liu, Jianming Zhang, Wenlong Zhou, Xiao Li, Pai Zeng, Shiyong Li, Yuanpan Qian, Dong Li, Zhaogeng Li	2025-12-11	Download	DeepSeek-V3.2-Exp introduces a sparse attention mechanism that significantly reduces inference latency in long-context scenarios. Although the overall throughput has improved greatly, the Decode-stage...
Clustered Federated Learning with Hierarchical Knowledge Distillation	Sabtain Ahmad, Meerzhan Kanatbekova, Ivona Brandic, Atakan Aral	2025-12-11	Download	Clustered Federated Learning (CFL) has emerged as a powerful approach for addressing data heterogeneity and ensuring privacy in large distributed IoT environments.
Differential Privacy for Secure Machine Learning in Healthcare IoT-Cloud Systems	N Mangala, Murtaza Rangwala, S Aishwarya, B Eswara Reddy, Rajkumar Buyya, KR Venugopal, SS Iyengar, LM Patnaik	2025-12-11	Download	Healthcare has become exceptionally sophisticated, as wearables and connected medical devices revolutionize remote patient monitoring, emergency response, medication management, diagnosis, and predict...
Making Wide Stripes Practical: Cascaded Parity LRCs for Efficient Repair and High Reliability	Fan Yu, Guodong Li, Si Wu, Weijun Fang, Sihuang Hu	2025-12-11	Download	Erasure coding with wide stripes is increasingly adopted to reduce storage overhead in large-scale storage systems. However, existing Locally Repairable Codes (LRCs) exhibit structural limitations in ...
HybridFlow: Resource-Adaptive Subtask Routing for Efficient Edge-Cloud LLM Inference	Jiangwen Dong, Jiayu Li, Tianhang Zheng, Wanyu Lin	2025-12-11	Download	Edge-cloud collaborative inference is becoming a practical necessity for LLM-powered edge devices: on-device models often cannot afford the required reasoning capability, while cloud-only inference co...
D2M: A Decentralized, Privacy-Preserving, Incentive-Compatible Data Marketplace for Collaborative Learning	Yash Srivastava, Shalin Jain, Sneha Awathare, Nitin Awathare	2025-12-11	Download	The rising demand for collaborative machine learning and data analytics calls for secure and decentralized data sharing frameworks that balance privacy, trust, and incentives.
Bit of a Close Talker: A Practical Guide to Serverless Cloud Co-Location Attacks	Wei Shao, Najmeh Nazari, Behnam Omidi, Setareh Rafatirad, Khaled N. Khasawneh, Houman Homayoun, Chongzhou Fang	2025-12-11	Download	Serverless computing has revolutionized cloud computing by offering users an efficient, cost-effective way to develop and deploy applications without managing infrastructure details.
High-Dimensional Data Processing: Benchmarking Machine Learning and Deep Learning Architectures in Local and Distributed Environments	Julian Rodriguez, Piotr Lopez, Emiliano Lerma, Rafael Medrano, Jacobo Hernandez	2025-12-11	Download	This document reports the sequence of practices and methodologies implemented during the Big Data course. It details the workflow beginning with the processing of the Epsilon dataset through group and...
Hybrid Learning and Optimization-Based Dynamic Scheduling for DL Workloads on Heterogeneous GPU Clusters	Shruti Dongare, Redwan Ibne Seraj Khan, Hadeel Albahar, Nannan Zhao, Diego Melendez Maita, Ali R. Butt	2025-12-11	Download	Modern cloud platforms increasingly host large-scale deep learning (DL) workloads, demanding high-throughput, low-latency GPU scheduling. However, the growing heterogeneity of GPU clusters and limited...
SlimEdge: Performance and Device Aware Distributed DNN Deployment on Resource-Constrained Edge Hardware	Mahadev Sunil Kumar, Arnab Raha, Debayan Das, Gopakumar G, Rounak Chatterjee, Amitava Mukherjee	2025-12-11	Download	Distributed deep neural networks (DNNs) have become central to modern computer vision, yet their deployment on resource-constrained edge devices remains hindered by substantial parameter counts, compu...
Design Space Exploration of DMA based Finer-Grain Compute Communication Overlap	Shagnik Pal, Shaizeen Aga, Suchita Pati, Mahzabeen Islam, Lizy K. John	2025-12-11	Download	As both ML training and inference are increasingly distributed, parallelization techniques that shard (divide) ML model across GPUs of a distributed system, are often deployed.
SoDA: An Efficient Interaction Paradigm for the Agentic Web	Zicai Cui, Zhouyuan Jian, Weiwen Liu, Weinan Zhang	2025-12-11	Download	As the internet evolves from the mobile App-dominated Attention Economy to the Intent-Interconnection of the Agentic Web era, existing interaction modes fail to address the escalating challenges of da...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
BIER-Star: Stateless Geographic Multicast for Scalable Satellite-Terrestrial Integration	Mostafa Abdollahi, Wenjun Yang, Jianping Pan	2025-12-11	Download	The rapid expansion of LEO satellite constellations has enabled an integrated terrestrial network and non-terrestrial network (TN-NTN), connecting diverse users such as aircraft, ships, and remote com...
SHIFT: Exploring the Boundary of RDMA Network Fault Tolerance	Shengkai Lin, Kairui Zhou, Hongtao Zhang, Yibo Wu, Yi Pan, Yihan Yang, Qinwei Yang, Wei Zhang, Arvind Krishnamurthy, Shizhen Zhao	2025-12-11	Download	Under gang scheduling for large-scale distributed large language model (LLM) training, a single network anomaly can stall or abort an entire job.
A Differentiable Digital Twin of Distributed Link Scheduling for Contention-Aware Networking	Zhongyuan Zhao, Yujun Ming, Kevin Chan, Ananthram Swami, Santiago Segarra	2025-12-11	Download	Many routing and flow optimization problems in wired networks can be solved efficiently using minimum cost flow formulations. However, this approach does not extend to wireless multi-hop networks, whe...
Natural Language Interface for Firewall Configuration	F. Taghiyev, A. Aslanbayli	2025-12-11	Download	This paper presents the design and prototype implementation of a natural language interface for configuring enterprise firewalls. The framework allows administrators to express access control policies...
L2 Ethernet Switch VLSI Implementation	Aniruddh Mishra, Benjamin Oommen, Jimmy Liang	2025-12-11	Download	Ethernet switches are foundational to the global internet infrastructure. These devices route packets of data on a local area network between source addresses to destination media access control addre...