2025-02-06

cs.AR - Hardware Architecture

Title	Authors	Published	PDF	Abstract
WaferLLM: Large Language Model Inference at Wafer Scale	Congjie He, Yeqi Huang, Pei Mu, Ziming Miao, Jilong Xue, Lingxiao Ma, Fan Yang, Luo Mai	2025-02-06	Download	Emerging AI accelerators increasingly adopt wafer-scale manufacturing technologies, integrating hundreds of thousands of AI cores in a mesh architecture with large distributed on-chip memory (tens of ...
All-in-One Analog AI Hardware: On-Chip Training and Inference with Conductive-Metal-Oxide/HfOx ReRAM Devices	Donato Francesco Falcone, Victoria Clerico, Wooseok Choi, Tommaso Stecconi, Folkert Horst, Laura Begon-Lours, Matteo Galetta, Antonio La Porta, Nikhil Garg, Fabien Alibart, Bert Jan Offrein, Valeria Bragaglia	2025-02-06	Download	Analog in-memory computing is an emerging paradigm designed to efficiently accelerate deep neural network workloads. Recent advancements have focused on either inference or training acceleration.
Systolic Sparse Tensor Slices: FPGA Building Blocks for Sparse and Dense AI Acceleration	Endri Taka, Ning-Chi Huang, Chi-Chih Chang, Kai-Chiang Wu, Aman Arora, Diana Marculescu	2025-02-06	Download	FPGA architectures have recently been enhanced to meet the substantial computational demands of modern deep neural networks (DNNs). To this end, both FPGA vendors and academic researchers have propose...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Light Virtualization: a proof-of-concept for hardware-based virtualization	Francesco Ciraolo, Mattia Nicolella, Denis Hoornaert, Marco Caccamo, Renato Mancuso	2025-02-06	Download	Virtualization has become widespread across all computing environments, from edge devices to cloud systems. Its main advantages are resource management through abstraction and improved isolation of pl...
WaferLLM: Large Language Model Inference at Wafer Scale	Congjie He, Yeqi Huang, Pei Mu, Ziming Miao, Jilong Xue, Lingxiao Ma, Fan Yang, Luo Mai	2025-02-06	Download	Emerging AI accelerators increasingly adopt wafer-scale manufacturing technologies, integrating hundreds of thousands of AI cores in a mesh architecture with large distributed on-chip memory (tens of ...
FedOptimus: Optimizing Vertical Federated Learning for Scalability and Efficiency	Nikita Shrivastava, Drishya Uniyal, Bapi Chatterjee	2025-02-06	Download	Federated learning (FL) is a collaborative machine learning paradigm which ensures data privacy by training models across distributed datasets without centralizing sensitive information.
A Performance Analysis of You Only Look Once Models for Deployment on Constrained Computational Edge Devices in Drone Applications	Lucas Rey, Ana M. Bernardos, Andrzej D. Dobrzycki, David Carramiñana, Luca Bergesio, Juan A. Besada, José Ramón Casar	2025-02-06	Download	Advancements in embedded systems and Artificial Intelligence (AI) have enhanced the capabilities of Unmanned Aircraft Vehicles (UAVs) in computer vision.
IPComp: Interpolation Based Progressive Lossy Compression for Scientific Applications	Zhuoxun Yang, Sheng Di, Longtao Zhang, Ruoyu Li, Ximiao Li, Jiajun Huang, Jinyang Liu, Franck Cappello, Kai Zhao	2025-02-06	Download	Compression is a crucial solution for data reduction in modern scientific applications due to the exponential growth of data from simulations, experiments, and observations.
Non-convex composite federated learning with heterogeneous data	Jiaojiao Zhang, Jiang Hu, Mikael Johansson	2025-02-06	Download	We propose an innovative algorithm for non-convex composite federated learning that decouples the proximal operator evaluation and the communication between server and clients.
Oblivious Robots Under Round Robin: Gathering on Rings	Alfredo Navarra, Francesco Piselli	2025-02-06	Download	Robots with very limited capabilities are placed on the vertices of a graph and are required to move toward a single, common vertex, where they remain stationary once they arrive.
DistrEE: Distributed Early Exit of Deep Neural Network Inference on Edge Devices	Xian Peng, Xin Wu, Lianming Xu, Li Wang, Aiguo Fei	2025-02-06	Download	Distributed DNN inference is becoming increasingly important as the demand for intelligent services at the network edge grows. By leveraging the power of distributed computing, edge devices can perfor...
InfiniteHBD: Building Datacenter-Scale High-Bandwidth Domain for LLM with Optical Circuit Switching Transceivers	Chenchen Shou, Guyue Liu, Hao Nie, Huaiyu Meng, Yu Zhou, Yimin Jiang, Wenqing Lv, Yelong Xu, Yuanwei Lu, Zhang Chen, Yanbo Yu, Yichen Shen, Yibo Zhu, Daxin Jiang	2025-02-06	Download	Scaling Large Language Model (LLM) training relies on multi-dimensional parallelism, where High-Bandwidth Domains (HBDs) are critical for communication-intensive parallelism like Tensor Parallelism.
Exploring Uncore Frequency Scaling for Heterogeneous Computing	Zhong Zheng, Seyfal Sultanov, Michael E. Papka, Zhiling Lan	2025-02-06	Download	High-performance computing (HPC) systems are essential for scientific discovery and engineering innovation. However, their growing power demands pose significant challenges, particularly as systems sc...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Saflo: eBPF-Based MPTCP Scheduler for Mitigating Traffic Analysis Attacks in Cellular Networks	Sangwoo Lee, Liuyi Jin, Radu Stoleru	2025-02-06	Download	This paper presents the safe subflow (Saflo) eBPF-based multipath TCP (MPTCP) scheduler, designed to mitigate traffic analysis attacks in cellular network...
Safeguarding connected autonomous vehicle communication: Protocols, intra- and inter-vehicular attacks and defenses	Mohammed Aledhari, Rehma Razzak, Mohamed Rahouti, Abbas Yazdinejad, Reza M. Parizi, Basheer Qolomany, Mohsen Guizani, Junaid Qadir, Ala Al-Fuqaha	2025-02-06	Download	The advancements in autonomous driving technology, coupled with the growing interest from automotive manufacturers and tech companies, suggest a rising adoption of Connected Autonomous Vehicles (CAVs)...
Technical Report: Generating the WEB-IDS23 Dataset	Eric Lanfer, Dominik Brockmann, Nils Aschenbruck	2025-02-06	Download	Anomaly-based Network Intrusion Detection Systems (NIDS) require correctly labelled, representative and diverse datasets for an accurate evaluation and development.
A Slicing Model for Transport Networks with Traffic Burst Control and QoS Compliance for Traffic Flows	Aitor Encinas-Alonso, Carlos M. Lentisco, Ignacio Soto, Luis Bellido, David Fernandez	2025-02-06	Download	Network slicing has emerged as a key network technology, providing network operators with the means to offer virtual networks to vertical users over a single physical network infrastructure.
InfiniteHBD: Building Datacenter-Scale High-Bandwidth Domain for LLM with Optical Circuit Switching Transceivers	Chenchen Shou, Guyue Liu, Hao Nie, Huaiyu Meng, Yu Zhou, Yimin Jiang, Wenqing Lv, Yelong Xu, Yuanwei Lu, Zhang Chen, Yanbo Yu, Yichen Shen, Yibo Zhu, Daxin Jiang	2025-02-06	Download	Scaling Large Language Model (LLM) training relies on multi-dimensional parallelism, where High-Bandwidth Domains (HBDs) are critical for communication-intensive parallelism like Tensor Parallelism.