
2025-02-06

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| WaferLLM: Large Language Model Inference at Wafer Scale | Congjie He, Yeqi Huang, Pei Mu, Ziming Miao, Jilong Xue, Lingxiao Ma, Fan Yang, Luo Mai | 2025-02-06 | Download | Emerging AI accelerators increasingly adopt wafer-scale manufacturing technologies, integrating hundreds of thousands of AI cores in a mesh architecture with large distributed on-chip memory (tens of ... |
| All-in-One Analog AI Hardware: On-Chip Training and Inference with Conductive-Metal-Oxide/HfOx ReRAM Devices | Donato Francesco Falcone, Victoria Clerico, Wooseok Choi, Tommaso Stecconi, Folkert Horst, Laura Begon-Lours, Matteo Galetta, Antonio La Porta, Nikhil Garg, Fabien Alibart, Bert Jan Offrein, Valeria Bragaglia | 2025-02-06 | Download | Analog in-memory computing is an emerging paradigm designed to efficiently accelerate deep neural network workloads. Recent advancements have focused on either inference or training acceleration. |
| Systolic Sparse Tensor Slices: FPGA Building Blocks for Sparse and Dense AI Acceleration | Endri Taka, Ning-Chi Huang, Chi-Chih Chang, Kai-Chiang Wu, Aman Arora, Diana Marculescu | 2025-02-06 | Download | FPGA architectures have recently been enhanced to meet the substantial computational demands of modern deep neural networks (DNNs). To this end, both FPGA vendors and academic researchers have propose... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Light Virtualization: a proof-of-concept for hardware-based virtualization | Francesco Ciraolo, Mattia Nicolella, Denis Hoornaert, Marco Caccamo, Renato Mancuso | 2025-02-06 | Download | Virtualization has become widespread across all computing environments, from edge devices to cloud systems. Its main advantages are resource management through abstraction and improved isolation of pl... |
| WaferLLM: Large Language Model Inference at Wafer Scale | Congjie He, Yeqi Huang, Pei Mu, Ziming Miao, Jilong Xue, Lingxiao Ma, Fan Yang, Luo Mai | 2025-02-06 | Download | Emerging AI accelerators increasingly adopt wafer-scale manufacturing technologies, integrating hundreds of thousands of AI cores in a mesh architecture with large distributed on-chip memory (tens of ... |
| FedOptimus: Optimizing Vertical Federated Learning for Scalability and Efficiency | Nikita Shrivastava, Drishya Uniyal, Bapi Chatterjee | 2025-02-06 | Download | Federated learning (FL) is a collaborative machine learning paradigm which ensures data privacy by training models across distributed datasets without centralizing sensitive information. |
| A Performance Analysis of You Only Look Once Models for Deployment on Constrained Computational Edge Devices in Drone Applications | Lucas Rey, Ana M. Bernardos, Andrzej D. Dobrzycki, David Carramiñana, Luca Bergesio, Juan A. Besada, José Ramón Casar | 2025-02-06 | Download | Advancements in embedded systems and Artificial Intelligence (AI) have enhanced the capabilities of Unmanned Aircraft Vehicles (UAVs) in computer vision. |
| IPComp: Interpolation Based Progressive Lossy Compression for Scientific Applications | Zhuoxun Yang, Sheng Di, Longtao Zhang, Ruoyu Li, Ximiao Li, Jiajun Huang, Jinyang Liu, Franck Cappello, Kai Zhao | 2025-02-06 | Download | Compression is a crucial solution for data reduction in modern scientific applications due to the exponential growth of data from simulations, experiments, and observations. |
| Non-convex composite federated learning with heterogeneous data | Jiaojiao Zhang, Jiang Hu, Mikael Johansson | 2025-02-06 | Download | We propose an innovative algorithm for non-convex composite federated learning that decouples the proximal operator evaluation and the communication between server and clients. |
| Oblivious Robots Under Round Robin: Gathering on Rings | Alfredo Navarra, Francesco Piselli | 2025-02-06 | Download | Robots with very limited capabilities are placed on the vertices of a graph and are required to move toward a single, common vertex, where they remain stationary once they arrive. |
| DistrEE: Distributed Early Exit of Deep Neural Network Inference on Edge Devices | Xian Peng, Xin Wu, Lianming Xu, Li Wang, Aiguo Fei | 2025-02-06 | Download | Distributed DNN inference is becoming increasingly important as the demand for intelligent services at the network edge grows. By leveraging the power of distributed computing, edge devices can perfor... |
| InfiniteHBD: Building Datacenter-Scale High-Bandwidth Domain for LLM with Optical Circuit Switching Transceivers | Chenchen Shou, Guyue Liu, Hao Nie, Huaiyu Meng, Yu Zhou, Yimin Jiang, Wenqing Lv, Yelong Xu, Yuanwei Lu, Zhang Chen, Yanbo Yu, Yichen Shen, Yibo Zhu, Daxin Jiang | 2025-02-06 | Download | Scaling Large Language Model (LLM) training relies on multi-dimensional parallelism, where High-Bandwidth Domains (HBDs) are critical for communication-intensive parallelism like Tensor Parallelism. |
| Exploring Uncore Frequency Scaling for Heterogeneous Computing | Zhong Zheng, Seyfal Sultanov, Michael E. Papka, Zhiling Lan | 2025-02-06 | Download | High-performance computing (HPC) systems are essential for scientific discovery and engineering innovation. However, their growing power demands pose significant challenges, particularly as systems sc... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Saflo: eBPF-Based MPTCP Scheduler for Mitigating Traffic Analysis Attacks in Cellular Networks | Sangwoo Lee, Liuyi Jin, Radu Stoleru | 2025-02-06 | Download | This paper presents the safe subflow (Saflo) eBPF-based multipath TCP (MPTCP) scheduler, designed to mitigate traffic analysis attacks in cellular network... |
| Safeguarding connected autonomous vehicle communication: Protocols, intra- and inter-vehicular attacks and defenses | Mohammed Aledhari, Rehma Razzak, Mohamed Rahouti, Abbas Yazdinejad, Reza M. Parizi, Basheer Qolomany, Mohsen Guizani, Junaid Qadir, Ala Al-Fuqaha | 2025-02-06 | Download | The advancements in autonomous driving technology, coupled with the growing interest from automotive manufacturers and tech companies, suggest a rising adoption of Connected Autonomous Vehicles (CAVs)... |
| Technical Report: Generating the WEB-IDS23 Dataset | Eric Lanfer, Dominik Brockmann, Nils Aschenbruck | 2025-02-06 | Download | Anomaly-based Network Intrusion Detection Systems (NIDS) require correctly labelled, representative and diverse datasets for an accurate evaluation and development. |
| A Slicing Model for Transport Networks with Traffic Burst Control and QoS Compliance for Traffic Flows | Aitor Encinas-Alonso, Carlos M. Lentisco, Ignacio Soto, Luis Bellido, David Fernandez | 2025-02-06 | Download | Network slicing has emerged as a key network technology, providing network operators with the means to offer virtual networks to vertical users over a single physical network infrastructure. |
| InfiniteHBD: Building Datacenter-Scale High-Bandwidth Domain for LLM with Optical Circuit Switching Transceivers | Chenchen Shou, Guyue Liu, Hao Nie, Huaiyu Meng, Yu Zhou, Yimin Jiang, Wenqing Lv, Yelong Xu, Yuanwei Lu, Zhang Chen, Yanbo Yu, Yichen Shen, Yibo Zhu, Daxin Jiang | 2025-02-06 | Download | Scaling Large Language Model (LLM) training relies on multi-dimensional parallelism, where High-Bandwidth Domains (HBDs) are critical for communication-intensive parallelism like Tensor Parallelism. |
