2025-07-28

cs.AR - Architecture

| Title | Authors | Published | Abstract |
|---|---|---|---|
| Sustainable AI Training via Hardware-Software Co-Design on NVIDIA, AMD, and Emerging GPU Architectures | Yashasvi Makin, Rahul Maliakkal | 2025-07-28 | In particular, large-scale deep learning and artificial intelligence model training uses a lot of computational power and energy, so it poses serious sustainability issues. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | Abstract |
|---|---|---|---|
| LeMix: Unified Scheduling for LLM Training and Inference on Multi-GPU Systems | Yufei Li, Zexin Li, Yinglun Zhu, Cong Liu | 2025-07-28 | Modern deployment of large language models (LLMs) frequently involves both inference serving and continuous retraining to stay aligned with evolving data and user feedback. |
| Improving SpGEMM Performance Through Matrix Reordering and Cluster-wise Computation | Abdullah Al Raqibul Islam, Helen Xu, Dong Dai, Aydın Buluç | 2025-07-28 | Sparse matrix-sparse matrix multiplication (SpGEMM) is a key kernel in many scientific applications and graph workloads. Unfortunately, SpGEMM is bottlenecked by data movement due to its irregular mem... |
| Accelerating Deterministic Global Optimization via GPU-parallel Interval Arithmetic | Hongzhen Zhang, Tim Kerkenhoff, Neil Kichler, Manuel Dahmen, Alexander Mitsos, Uwe Naumann, Dominik Bongartz | 2025-07-28 | Spatial Branch and Bound (B&B) algorithms are widely used for solving nonconvex problems to global optimality, yet they remain computationally expensive. |
| Advancing Compositional LLM Reasoning with Structured Task Relations in Interactive Multimodal Communications | Xinye Cao, Hongcan Guo, Guoshun Nan, Jiaoyang Cui, Haoting Qian, Yihan Lin, Yilin Peng, Diyang Zhang, Yanzhao Hou, Huici Wu, Xiaofeng Tao, Tony Q. S. Quek | 2025-07-28 | Interactive multimodal applications (IMAs), such as route planning in the Internet of Vehicles, enrich users' personalized experiences by integrating various forms of data over wireless networks. |
| RIMMS: Runtime Integrated Memory Management System for Heterogeneous Computing | Serhan Gener, Aditya Ukarande, Shilpa Mysore Srinivasa Murthy, Sahil Hassan, Joshua Mack, Chaitali Chakrabarti, Umit Ogras, Ali Akoglu | 2025-07-28 | Efficient memory management in heterogeneous systems is increasingly challenging due to diverse compute architectures (e.g., CPU, GPU, FPGA) and dynamic task mappings not known at compile time. |
| Sustainable AI Training via Hardware-Software Co-Design on NVIDIA, AMD, and Emerging GPU Architectures | Yashasvi Makin, Rahul Maliakkal | 2025-07-28 | In particular, large-scale deep learning and artificial intelligence model training uses a lot of computational power and energy, so it poses serious sustainability issues. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | Abstract |
|---|---|---|---|
| Deep Reinforcement Learning-based Cell DTX/DRX Configuration for Network Energy Saving | Wei Mao, Lili Wei, Omid Semiari, Shu-ping Yeh, Hosein Nikopour | 2025-07-28 | 3GPP Release 18 cell discontinuous transmission and reception (cell DTX/DRX) is an important new network energy saving feature for 5G. As a time-domain technique, it periodically aggregates the user d... |
| Load Balancing for AI Training Workloads | Sarah McClure, Evyatar Cohen, Alex Shpiner, Mark Silberstein, Sylvia Ratnasamy, Scott Shenker, Isaac Keslassy | 2025-07-28 | The extreme bandwidth demands of AI training has made load-balancing a critical component in AI fabrics, and a variety of load-balancing designs have emerged in recent work from both industry and rese... |
| Towards a Robust Transport Network With Self-adaptive Network Digital Twin | Cláudio Modesto, João Borges, Cleverson Nahum, Lucas Matni, Cristiano Bonato Both, Kleber Cardoso, Glauco Gonçalves, Ilan Correa, Silvia Lins, Andrey Silva, Aldebaro Klautau | 2025-07-28 | The ability of the Network digital twin (NDT) to remain aware of changes in its physical counterpart, known as the physical twin (PTwin), is a fundamental condition to enable timely synchronization, a... |
| Handoff Design in User-Centric Cell-Free Massive MIMO Networks Using DRL | Hussein A. Ammar, Raviraj Adve, Shahram Shahbazpanahi, Gary Boudreau, Israfil Bahceci | 2025-07-28 | In the user-centric cell-free massive MIMO (UC-mMIMO) network scheme, user mobility necessitates updating the set of serving access points to maintain the user-centric clustering. |
| FedABC: Attention-Based Client Selection for Federated Learning with Long-Term View | Wenxuan Ye, Xueli An, Junfan Wang, Xueqiang Yan, Georg Carle | 2025-07-28 | Native AI support is a key objective in the evolution of 6G networks, with Federated Learning (FL) emerging as a promising paradigm. FL allows decentralized clients to collaboratively train an AI mode... |
| Collusion Resistant DNS With Private Information Retrieval | Yunming Xiao, Peizhi Liu, Ruijie Yu, Chenkai Weng, Matteo Varvello, Aleksandar Kuzmanovic | 2025-07-28 | There has been a growing interest in Internet user privacy, demonstrated by the popularity of privacy-preserving products such as Telegram and Brave, and the widespread adoption of HTTPS. |
| Curved Apertures for Customized Wave Trajectories: Beyond Flat Aperture Limitations | Joan Martínez Canals, Francesco Devoti, Vincenzo Sciancalepore, Marco Di Renzo, Xavier Costa-Pérez | 2025-07-28 | Beam shaping techniques enable tailored beam trajectories, offering unprecedented connectivity opportunities in wireless communications. Current approaches rely on flat apertures, which limit trajecto... |
| A Lyapunov-Guided Diffusion-Based Reinforcement Learning Approach for UAV-Assisted Vehicular Networks with Delayed CSI Feedback | Zhang Liu, Lianfen Huang, Zhibin Gao, Xianbin Wang, Dusit Niyato, Xuemin Shen | 2025-07-28 | Low altitude uncrewed aerial vehicles (UAVs) are expected to facilitate the development of aerial-ground integrated intelligent transportation systems and unlocking the potential of the emerging low-a... |
| DD-JSCC: Dynamic Deep Joint Source-Channel Coding for Semantic Communications | Avi Deb Raha, Apurba Adhikary, Mrityunjoy Gain, Yumin Park, Walid Saad, Choong Seon Hong | 2025-07-28 | Deep Joint Source-Channel Coding (Deep-JSCC) has emerged as a promising semantic communication approach for wireless image transmission by jointly optimizing source and channel coding using deep learn... |
| RadioMamba: Breaking the Accuracy-Efficiency Trade-off in Radio Map Construction via a Hybrid Mamba-UNet | Honggang Jia, Nan Cheng, Xiucheng Wang, Conghao Zhou, Ruijin Sun, Xuemin Shen | 2025-07-28 | Radio map (RM) has recently attracted much attention since it can provide real-time and accurate spatial channel information for 6G services and applications. |

cs.OS - Operating Systems

| Title | Authors | Published | Abstract |
|---|---|---|---|
| Locked In, Leaked Out: Measuring Isolation via Kernel Locks | Anjali, Michael M. Swift | 2025-07-28 | Isolation is a critical property for shared infrastructure to limit exposure and interference among simultaneous running workloads. Cloud providers use different isolation mechanisms such as full Virt... |
