2025-08-12

cs.AR - Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| A Limits Study of Memory-side Tiering Telemetry | Vinicius Petrucci, Felippe Zacarias, David Roberts | 2025-08-12 | Download | Increasing workload demands and emerging technologies necessitate the use of various memory and storage tiers in computing systems. This paper presents results from a CXL-based Experimental Memory Req... |
| Ultra Ethernet's Design Principles and Architectural Innovations | Torsten Hoefler, Karen Schramm, Eric Spada, Keith Underwood, Cedell Alexander, Bob Alverson, Paul Bottorff, Adrian Caulfield, Mark Handley, Cathy Huang, Costin Raiciu, Abdul Kabbani, Eugene Opsasnick, Rong Pan, Adee Ran, Rip Sohan | 2025-08-12 | Download | The recently released Ultra Ethernet (UE) 1.0 specification defines a transformative High-Performance Ethernet standard for future Artificial Intelligence (AI) and High-Performance Computing (HPC) sys... |
| OISMA: On-the-fly In-memory Stochastic Multiplication Architecture for Matrix-Multiplication Workloads | Shady Agwa, Yihan Pan, Georgios Papandroulidakis, Themis Prodromakis | 2025-08-12 | Download | Artificial Intelligence models are currently driven by a significant up-scaling of their complexity, with massive matrix multiplication workloads representing the major computational bottleneck. |
| Improving Chip Design Enablement for Universities in Europe -- A Position Paper | Lukas Krupp, Ian O'Connor, Luca Benini, Christoph Studer, Joachim Rodrigues, Norbert Wehn | 2025-08-12 | Download | The semiconductor industry is pivotal to Europe's economy, especially within the industrial and automotive sectors. However, Europe faces a significant shortfall in chip design capabilities, marked by... |
| CRADLE: Conversational RTL Design Space Exploration with LLM-based Multi-Agent Systems | Lukas Krupp, Maximilian Schöffel, Elias Biehl, Norbert Wehn | 2025-08-12 | Download | This paper presents CRADLE, a conversational framework for design space exploration of RTL designs using LLM-based multi-agent systems. Unlike existing rigid approaches, CRADLE enables user-guided flo... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| P/D-Device: Disaggregated Large Language Model between Cloud and Devices | Yibo Jin, Yixu Xu, Yue Chen, Chengbin Wang, Tao Wang, Jiaqi Huang, Rongfei Zhang, Yiming Dong, Yuting Yan, Ke Cheng, Yingjie Zhu, Shulan Wang, Qianqian Tang, Shuaishuai Meng, Guanxin Cheng, Ze Wang, Shuyan Miao, Ketao Wang, Wen Liu, Yifan Yang, Tong Zhang, Anran Wang, Chengzhou Lu, Tiantian Dong, Yongsheng Zhang, Zhe Wang, Hefei Guo, Hongjie Liu, Wei Lu, Zhengyong Zhang | 2025-08-12 | Download | Serving disaggregated large language models has been widely adopted in industrial practice for enhanced performance. However, too many tokens generated in decoding phase, i.e. |
| Ultra Ethernet's Design Principles and Architectural Innovations | Torsten Hoefler, Karen Schramm, Eric Spada, Keith Underwood, Cedell Alexander, Bob Alverson, Paul Bottorff, Adrian Caulfield, Mark Handley, Cathy Huang, Costin Raiciu, Abdul Kabbani, Eugene Opsasnick, Rong Pan, Adee Ran, Rip Sohan | 2025-08-12 | Download | The recently released Ultra Ethernet (UE) 1.0 specification defines a transformative High-Performance Ethernet standard for future Artificial Intelligence (AI) and High-Performance Computing (HPC) sys... |
| Redactable Blockchains: An Overview | Federico Calandra, Marco Bernardo, Andrea Esposito, Francesco Fabris | 2025-08-12 | Download | Blockchains are widely recognized for their immutability, which provides robust guarantees of data integrity and transparency. However, this same feature poses significant challenges in real-world sit... |
| Scalable Graph Indexing using GPUs for Approximate Nearest Neighbor Search | Zhonggen Li, Xiangyu Ke, Yifan Zhu, Bocheng Yu, Baihua Zheng, Yunjun Gao | 2025-08-12 | Download | Approximate nearest neighbor search (ANNS) in high-dimensional vector spaces has a wide range of real-world applications. Numerous methods have been proposed to handle ANNS efficiently, while graph-ba... |
| Two for One, One for All: Deterministic LDC-based Robust Computation in Congested Clique | Keren Censor-Hillel, Orr Fischer, Ran Gelles, Pedro Soto | 2025-08-12 | Download | We design a deterministic compiler that makes any computation in the Congested Clique model robust to a constant fraction α<1 of adversarial crash faults. |
| A Survey on Parallel Text Generation: From Parallel Decoding to Diffusion Language Models | Lingzhe Zhang, Liancheng Fang, Chiming Duan, Minghua He, Leyi Pan, Pei Xiao, Shiyu Huang, Yunpeng Zhai, Xuming Hu, Philip S. Yu, Aiwei Liu | 2025-08-12 | Download | As text generation has become a core capability of modern Large Language Models (LLMs), it underpins a wide range of downstream applications. However, most existing LLMs rely on autoregressive (AR) ge... |
| Cluster Topology-Driven Placement of Experts Reduces Network Traffic in MoE Inference | Danil Sivtsov, Aleksandr Katrutsa, Ivan Oseledets | 2025-08-12 | Download | Efficient deployment of a pre-trained LLM to a cluster with multiple servers is a critical step for providing fast responses to users' queries. |
| Resource-Aware Aggregation and Sparsification in Heterogeneous Ensemble Federated Learning | Keumseo Ryum, Jinu Gong, Joonhyuk Kang | 2025-08-12 | Download | Federated learning (FL) enables distributed training with private client data, but its convergence is hindered by system heterogeneity under realistic communication scenarios. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| FedJam: Multimodal Federated Learning Framework for Jamming Detection | Ioannis Panitsas, Iason Ofeidis, Leandros Tassiulas | 2025-08-12 | Download | Jamming attacks pose a critical threat to wireless networks, yet existing detection methods remain largely unimodal, centralized and resource-intensive, limiting their performance, scalability, and de... |
| Dynamic Uncertainty-aware Multimodal Fusion for Outdoor Health Monitoring | Zihan Fang, Zheng Lin, Senkang Hu, Yihang Tao, Yiqin Deng, Xianhao Chen, Yuguang Fang | 2025-08-12 | Download | Outdoor health monitoring is essential to detect early abnormal health status for safeguarding human health and safety. Conventional outdoor monitoring relies on static multimodal deep learning framew... |
| Developing a Transferable Federated Network Intrusion Detection System | Abu Shafin Mohammad Mahdee Jameel, Shreya Ghosh, Aly El Gamal | 2025-08-12 | Download | Intrusion Detection Systems (IDS) are a vital part of a network-connected device. In this paper, we develop a deep learning based intrusion detection system that is deployed in a distributed setup acr... |
| FetFIDS: A Feature Embedding Attention based Federated Network Intrusion Detection Algorithm | Shreya Ghosh, Abu Shafin Mohammad Mahdee Jameel, Aly El Gamal | 2025-08-12 | Download | Intrusion Detection Systems (IDS) have an increasingly important role in preventing exploitation of network vulnerabilities by malicious actors. |
| NEFMind: Parameter-Efficient Fine-Tuning of Open-Source LLMs for Telecom APIs Automation | Zainab Khan, Ahmed Hussain, Mukesh Thakur, Arto Hellas, Panos Papadimitratos | 2025-08-12 | Download | The use of Service-Based Architecture in modern telecommunications has exponentially increased Network Functions (NFs) and Application Programming Interfaces (APIs), creating substantial operational c... |
| Ultra Ethernet's Design Principles and Architectural Innovations | Torsten Hoefler, Karen Schramm, Eric Spada, Keith Underwood, Cedell Alexander, Bob Alverson, Paul Bottorff, Adrian Caulfield, Mark Handley, Cathy Huang, Costin Raiciu, Abdul Kabbani, Eugene Opsasnick, Rong Pan, Adee Ran, Rip Sohan | 2025-08-12 | Download | The recently released Ultra Ethernet (UE) 1.0 specification defines a transformative High-Performance Ethernet standard for future Artificial Intelligence (AI) and High-Performance Computing (HPC) sys... |
| Cluster Topology-Driven Placement of Experts Reduces Network Traffic in MoE Inference | Danil Sivtsov, Aleksandr Katrutsa, Ivan Oseledets | 2025-08-12 | Download | Efficient deployment of a pre-trained LLM to a cluster with multiple servers is a critical step for providing fast responses to users' queries. |
| QoE-Aware Service Provision for Mobile AR Rendering: An Agent-Driven Approach | Conghao Zhou, Lulu Sun, Xiucheng Wang, Peng Yang, Feng Lyu, Sihan Lu, Xuemin Shen | 2025-08-12 | Download | Mobile augmented reality (MAR) is envisioned as a key immersive application in 6G, enabling virtual content rendering aligned with the physical environment through device pose estimation. |
| Traffic Load-Aware Resource Management Strategy for Underwater Wireless Sensor Networks | Tong Zhang, Yu Gou, Jun Liu, Jun-Hong Cui | 2025-08-12 | Download | Underwater Wireless Sensor Networks (UWSNs) represent a promising technology that enables diverse underwater applications through acoustic communication. |
| LLM-Driven Adaptive 6G-Ready Wireless Body Area Networks: Survey and Framework | Mohammad Jalili Torkamani, Negin Mahmoudi, Kiana Kiashemshaki | 2025-08-12 | Download | Wireless Body Area Networks (WBANs) enable continuous monitoring of physiological signals for applications ranging from chronic disease management to emergency response. |

cs.OS - Operating Systems

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| A Limits Study of Memory-side Tiering Telemetry | Vinicius Petrucci, Felippe Zacarias, David Roberts | 2025-08-12 | Download | Increasing workload demands and emerging technologies necessitate the use of various memory and storage tiers in computing systems. This paper presents results from a CXL-based Experimental Memory Req... |
| Ultra Ethernet's Design Principles and Architectural Innovations | Torsten Hoefler, Karen Schramm, Eric Spada, Keith Underwood, Cedell Alexander, Bob Alverson, Paul Bottorff, Adrian Caulfield, Mark Handley, Cathy Huang, Costin Raiciu, Abdul Kabbani, Eugene Opsasnick, Rong Pan, Adee Ran, Rip Sohan | 2025-08-12 | Download | The recently released Ultra Ethernet (UE) 1.0 specification defines a transformative High-Performance Ethernet standard for future Artificial Intelligence (AI) and High-Performance Computing (HPC) sys... |

cs.PF - Performance

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| A Limits Study of Memory-side Tiering Telemetry | Vinicius Petrucci, Felippe Zacarias, David Roberts | 2025-08-12 | Download | Increasing workload demands and emerging technologies necessitate the use of various memory and storage tiers in computing systems. This paper presents results from a CXL-based Experimental Memory Req... |
| Ultra Ethernet's Design Principles and Architectural Innovations | Torsten Hoefler, Karen Schramm, Eric Spada, Keith Underwood, Cedell Alexander, Bob Alverson, Paul Bottorff, Adrian Caulfield, Mark Handley, Cathy Huang, Costin Raiciu, Abdul Kabbani, Eugene Opsasnick, Rong Pan, Adee Ran, Rip Sohan | 2025-08-12 | Download | The recently released Ultra Ethernet (UE) 1.0 specification defines a transformative High-Performance Ethernet standard for future Artificial Intelligence (AI) and High-Performance Computing (HPC) sys... |
| OISMA: On-the-fly In-memory Stochastic Multiplication Architecture for Matrix-Multiplication Workloads | Shady Agwa, Yihan Pan, Georgios Papandroulidakis, Themis Prodromakis | 2025-08-12 | Download | Artificial Intelligence models are currently driven by a significant up-scaling of their complexity, with massive matrix multiplication workloads representing the major computational bottleneck. |
| Profiling Large Language Model Inference on Apple Silicon: A Quantization Perspective | Afsara Benazir, Felix Xiaozhu Lin | 2025-08-12 | Download | A systematic understanding of Apple Silicon is lacking in the current landscape of hardware efficiency; research focus is largely centered on accelerating GPUs for large-scale training or inference on... |
