2025-10-09

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Production-Grade Local LLM Inference on Apple Silicon: A Comparative Study of MLX, MLC-LLM, Ollama, llama.cpp, and PyTorch MPS	Varun Rajesh, Om Jodhpurkar, Pooja Anbuselvan, Mantinder Singh, Ashok Jallepali, Shantanu Godbole, Pradeep Kumar Sharma, Hritvik Shrivastava	2025-10-09	Download	We present a systematic, empirical evaluation of five local large language model (LLM) runtimes on Apple Silicon: MLX, MLC-LLM, llama.cpp, Ollama, and PyTorch MPS.
LOTION: Smoothing the Optimization Landscape for Quantized Training	Mujin Kwun, Depen Morwani, Chloe Huangyuan Su, Stephanie Gil, Nikhil Anand, Sham Kakade	2025-10-09	Download	Optimizing neural networks for quantized objectives is fundamentally challenging because the quantizer is piece-wise constant, yielding zero gradients everywhere except at quantization thresholds wher...
SPAD: Specialized Prefill and Decode Hardware for Disaggregated LLM Inference	Hengrui Zhang, Pratyush Patel, August Ning, David Wentzlaff	2025-10-09	Download	Large Language Models (LLMs) have gained popularity in recent years, driving up the demand for inference. LLM inference is composed of two phases with distinct characteristics: a compute-bound prefill...
Fletch: File-System Metadata Caching in Programmable Switches	Qingxiu Liu, Jiazhen Cai, Siyuan Sheng, Yuhui Chen, Lu Tang, Zhirong Shen, Patrick P. C. Lee	2025-10-09	Download	Fast and scalable metadata management across multiple metadata servers is crucial for distributed file systems to handle numerous files and directories.
Efficient Deployment of CNN Models on Multiple In-Memory Computing Units	Eleni Bougioukou, Theodore Antonakopoulos	2025-10-09	Download	In-Memory Computing (IMC) represents a paradigm shift in deep learning acceleration by mitigating data movement bottlenecks and leveraging the inherent parallelism of memory-based computations.
A Scalable FPGA Architecture With Adaptive Memory Utilization for GEMM-Based Operations	Anastasios Petropoulos, Theodore Antonakopoulos	2025-10-09	Download	Deep neural network (DNN) inference relies increasingly on specialized hardware for high computational efficiency. This work introduces a field-programmable gate array (FPGA)-based dynamically configu...
VeriPy -- A New Python-Based Approach for SDR Pipelined/Unrolled Hardware Accelerator Generation	Yuqin Zhao, Linghui Ye, Haihang Xia, Luke Seed, Tiantai Deng	2025-10-09	Download	Software-defined radio (SDR) plays an important role in the communication field by providing a flexible and customized communication system for different purposes according to the needs.
DL-PIM: Improving Data Locality in Processing-in-Memory Systems	Parker Hao Tian, Zahra Yousefijamarani, Alaa Alameldeen	2025-10-09	Download	PIM architectures aim to reduce data transfer costs between processors and memory by integrating processing units within memory layers. Prior PIM architectures have shown potential to improve energy e...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Comparative Performance Analysis of Modern NoSQL Data Technologies: Redis, Aerospike, and Dragonfly	Deep Bodra, Sushil Khairnar	2025-10-09	Download	The rise of distributed applications and cloud computing has created a demand for scalable, high-performance key-value storage systems. This paper presents a performance evaluation of three prominent ...
Maple: A Multi-agent System for Portable Deep Learning across Clusters	Molang Wu, Zhao Zhang	2025-10-09	Download	Training deep learning (DL) models across Graphics Processing Unit (GPU) clusters is technically challenging. One aspect is that users have to compose command lines to adapt to the heterogeneous launc...
Reinforcement Learning-Driven Edge Management for Reliable Multi-view 3D Reconstruction	Motahare Mounesan, Sourya Saha, Houchao Gan, Md. Nurul Absur, Saptarshi Debroy	2025-10-09	Download	Real-time multi-view 3D reconstruction is a mission-critical application for key edge-native use cases, such as fire rescue, where timely and accurate 3D scene modeling enables situational awareness a...
Man-Made Heuristics Are Dead. Long Live Code Generators!	Rohit Dwivedula, Divyanshu Saxena, Aditya Akella, Swarat Chaudhuri, Daehyeok Kim	2025-10-09	Download	Policy design for various systems controllers has conventionally been a manual process, with domain experts carefully tailoring heuristics for the specific instance in which the policy will be deploye...
Are Voters Willing to Collectively Secure Elections? Unraveling a Practical Blockchain Voting System	Zhuolun Li, Haluk Sonmezler, Faiza Shirazi, Febin Shaji, Tymoteusz Mroczkowski, Dexter Lardner, Matthew Alain Camus, Evangelos Pournaras	2025-10-09	Download	Ensuring ballot secrecy is critical for fair and trustworthy electronic voting systems, yet achieving strong secrecy guarantees in decentralized, large-scale elections remains challenging.
SPAD: Specialized Prefill and Decode Hardware for Disaggregated LLM Inference	Hengrui Zhang, Pratyush Patel, August Ning, David Wentzlaff	2025-10-09	Download	Large Language Models (LLMs) have gained popularity in recent years, driving up the demand for inference. LLM inference is composed of two phases with distinct characteristics: a compute-bound prefill...
Investigating Matrix Repartitioning to Address the Over- and Undersubscription Challenge for a GPU-based CFD Solver	Gregor Olenik, Marcel Koch, Hartwig Anzt	2025-10-09	Download	Modern high-performance computing (HPC) increasingly relies on GPUs, but integrating GPU acceleration into complex scientific frameworks like OpenFOAM remains a challenge.
DYNAMIX: RL-based Adaptive Batch Size Optimization in Distributed Machine Learning Systems	Yuanjun Dai, Keqiang He, An Wang	2025-10-09	Download	Existing batch size selection approaches in distributed machine learning rely on static allocation or simplistic heuristics that fail to adapt to heterogeneous, dynamic computing environments.
Conceptual Design Report for FAIR Computing	Johan Messchendorp, Mohammad Al-Turany, Volker Friese, Thorsten Kollegger, Bastian Loeher, Jochen Markert, Andrew Mistry, Thomas Neff, Adrian Oeftiger, Michael Papenbrock, Stephane Pietri, Shahab Sanjari, Tobias Stockmanns	2025-10-09	Download	This Conceptual Design Report (CDR) presents the plans of the computing infrastructure for research at FAIR, Darmstadt, Germany. It presents the computing requirements of the various research groups, ...
Energy-Efficient Maximal Independent Sets in Radio Networks	Dominick Banasik, Varsha Dani, Fabien Dufoulon, Aayush Gupta, Thomas P. Hayes, Gopal Pandurangan	2025-10-09	Download	The maximal independent set (MIS) is one of the most fundamental problems in distributed computing, and it has been studied intensively for over four decades.
pyGinkgo: A Sparse Linear Algebra Operator Framework for Python	Keshvi Tuteja, Gregor Olenik, Roman Mishchuk, Yu-Hsiang Tsai, Markus Götz, Achim Streit, Hartwig Anzt, Charlotte Debus	2025-10-09	Download	Sparse linear algebra is a cornerstone of many scientific computing and machine learning applications. Python has become a popular choice for these applications due to its simplicity and ease of use.
Distributed Resource Selection for Self-Organising Cloud-Edge Systems	Quentin Renau, Amjad Ullah, Emma Hart	2025-10-09	Download	This paper presents a distributed resource selection mechanism for diverse cloud-edge environments, enabling dynamic and context-aware allocation of resources to meet the demands of complex distribute...
Towards Energy-Efficient Serverless Computing with Hardware Isolation	Natalie Carl, Tobias Pfandzelter, David Bermbach	2025-10-09	Download	Serverless computing provides just-in-time infrastructure provisioning with rapid elasticity and a finely-grained pricing model. As full control of resource allocation is in the hands of the cloud pro...
A Multi-Simulation Bridge for IoT Digital Twins	Marco Picone, Samuele Burattini, Marco Melloni, Prasad Talasila, Davide Ziglioli, Matteo Martinelli, Nicola Bicocchi, Alessandro Ricci, Peter Gorm Larsen	2025-10-09	Download	The increasing capabilities of Digital Twins (DTs) in the context of the Internet of Things (IoT) and Industrial IoT (IIoT) call for seamless integration with simulation platforms to support system de...
BlockSDN: Towards a High-Performance Blockchain via Software-Defined Cross Networking optimization	Wenyang Jia, Jingjing Wang, Ziwei Yan, Xiangli Peng, Guohui Yuan	2025-10-09	Download	The scalability of blockchain systems is constrained by inefficient P2P broadcasting, as most existing optimizations focus only on the logical layer without considering physical network conditions.
A Semantic Model for Audit of Cloud Engines based on ISO/IEC TR 3445:2022	Morteza Sargolzaei Javan	2025-10-09	Download	Cloud computing has become the foundation of modern digital infrastructure, yet the absence of a unified architectural and compliance framework impedes interoperability, auditability, and robust secur...
When Light Bends to the Collective Will: A Theory and Vision for Adaptive Photonic Scale-up Domains	Vamsi Addanki	2025-10-09	Download	As chip-to-chip silicon photonics gain traction for their bandwidth and energy efficiency, collective communication has emerged as a critical bottleneck in scale-up systems.
From Tokens to Layers: Redefining Stall-Free Scheduling for LLM Serving with Layered Prefill	Gunjun Lee, Jiwon Kim, Jaiyoung Park, Younjoo Lee, Jung Ho Ahn	2025-10-09	Download	Large Language Model (LLM) inference in production must meet stringent service-level objectives for both time-to-first-token (TTFT) and time-between-token (TBT) while maximizing throughput under fixed...
SketchGuard: Scaling Byzantine-Robust Decentralized Federated Learning via Sketch-Based Screening	Murtaza Rangwala, Farag Azzedin, Richard O. Sinnott, Rajkumar Buyya	2025-10-09	Download	Decentralized Federated Learning enables privacy-preserving collaborative training without centralized servers but remains vulnerable to Byzantine attacks.
Decentralised Blockchain Management Through Digital Twins	Georgios Diamantopoulos, Nikos Tziritas, Rami Bahsoon, Georgios Theodoropoulos	2025-10-09	Download	The necessity of blockchain systems to remain decentralised limits current solutions to blockchain governance and dynamic management, forcing a trade-off between control and decentralisation.
Adaptive Execution Scheduler for DataDios SmartDiff	Aryan Poduri	2025-10-09	Download	We present an adaptive scheduler for a single differencing engine (SmartDiff) with two execution modes: (i) in-memory threads and (ii) Dask based parallelism.
FedQS: Optimizing Gradient and Model Aggregation for Semi-Asynchronous Federated Learning	Yunbo Li, Jiaping Gui, Zhihang Deng, Fanchao Meng, Yue Wu	2025-10-09	Download	Federated learning (FL) enables collaborative model training across multiple parties without sharing raw data, with semi-asynchronous FL (SAFL) emerging as a balanced approach between synchronous and ...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Prioritizing Latency with Profit: A DRL-Based Admission Control for 5G Network Slices	Proggya Chakraborty, Aaquib Asrar, Jayasree Sengupta, Sipra Das Bit	2025-10-09	Download	5G networks enable diverse services such as eMBB, URLLC, and mMTC through network slicing, necessitating intelligent admission control and resource allocation to meet stringent QoS requirements while ...
Robust Heuristic Algorithm Design with LLMs	Pantea Karimi, Dany Rouhana, Pooria Namyar, Siva Kesava Reddy Kakarla, Venkat Arun, Behnaz Arzani	2025-10-09	Download	We posit that we can generate more robust and performant heuristics if we augment approaches using LLMs for heuristic design with tools that explain why heuristics underperform and suggestions about h...
Curated Wireless Datasets for Aerial Network Research	Amir Hossein Fahim Raouf, Donggu Lee, Mushfiqur Rahman, Saad Masrur, Gautham Reddy, Cole Dickerson, Md Sharif Hossen, Sergio Vargas Villar, Anıl Gürses, Simran Singh, Sung Joon Maeng, Martins Ezuma, Christopher Roberts, Mohamed Rabeek Sarbudeen, Thomas J. Zajkowski, Magreth Mushi, Ozgur Ozdemir, Ram Asokan, Ismail Guvenc, Mihail L. Sichitiu, Rudra Dutta	2025-10-09	Download	This Review consolidates publicly available aerial wireless measurement datasets collected using AERPAW. We organize signal-level, power-level, and KPI-level datasets under a unified taxonomy, harmoni...
Dynamic Features Adaptation in Networking: Toward Flexible training and Explainable inference	Yannis Belkhiter, Seshu Tirupathi, Giulio Zizzo, Merim Dzaferagic, John D. Kelleher	2025-10-09	Download	As AI becomes a native component of 6G network control, AI models must adapt to continuously changing conditions, including the introduction of new features and measurements driven by multi-vendor dep...
Serv-Drishti: An Interactive Serverless Function Request Simulation Engine and Visualiser	Siddharth Agarwal, Maria A. Rodriguez, Rajkumar Buyya	2025-10-09	Download	The rapid adoption of serverless computing necessitates a deeper understanding of its underlying operational mechanics, particularly concerning request routing, cold starts, function scaling, and reso...
BlockSDN: Towards a High-Performance Blockchain via Software-Defined Cross Networking optimization	Wenyang Jia, Jingjing Wang, Ziwei Yan, Xiangli Peng, Guohui Yuan	2025-10-09	Download	The scalability of blockchain systems is constrained by inefficient P2P broadcasting, as most existing optimizations focus only on the logical layer without considering physical network conditions.
URLLC for 6G Enabled Industry 5.0: A Taxonomy of Architectures, Cross Layer Techniques, and Time Critical Applications	Abdikarim Mohamed Ibrahim, Rosdiadee Nordin, Yahya S. M. Khamayseh, Angela Amphawan, Muhammed Basheer Jasser	2025-10-09	Download	The evolution from Industry 4.0 to Industry 5.0 introduces stringent requirements for ultra reliable low latency communication (URLLC) to support human centric, intelligent, and resilient industrial s...
When Light Bends to the Collective Will: A Theory and Vision for Adaptive Photonic Scale-up Domains	Vamsi Addanki	2025-10-09	Download	As chip-to-chip silicon photonics gain traction for their bandwidth and energy efficiency, collective communication has emerged as a critical bottleneck in scale-up systems.
TDoA-Based Self-Supervised Channel Charting with NLoS Mitigation	Mohsen Ahadi, Omid Esrafilian, Florian Kaltenberger, Adeel Malik	2025-10-09	Download	Channel Charting (CC) has emerged as a promising framework for data-driven radio localization, yet existing approaches often struggle to scale globally and to handle the distortions introduced by non-...

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
Man-Made Heuristics Are Dead. Long Live Code Generators!	Rohit Dwivedula, Divyanshu Saxena, Aditya Akella, Swarat Chaudhuri, Daehyeok Kim	2025-10-09	Download	Policy design for various systems controllers has conventionally been a manual process, with domain experts carefully tailoring heuristics for the specific instance in which the policy will be deploye...
Rethinking Provenance Completeness with a Learning-Based Linux Scheduler	Jinsong Mao, Benjamin E. Ujcich, Shiqing Ma	2025-10-09	Download	Provenance plays a critical role in maintaining traceability of a system's actions for root cause analysis of security threats and impacts. Provenance collection is often incorporated into the referen...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Prioritizing Latency with Profit: A DRL-Based Admission Control for 5G Network Slices	Proggya Chakraborty, Aaquib Asrar, Jayasree Sengupta, Sipra Das Bit	2025-10-09	Download	5G networks enable diverse services such as eMBB, URLLC, and mMTC through network slicing, necessitating intelligent admission control and resource allocation to meet stringent QoS requirements while ...
pyGinkgo: A Sparse Linear Algebra Operator Framework for Python	Keshvi Tuteja, Gregor Olenik, Roman Mishchuk, Yu-Hsiang Tsai, Markus Götz, Achim Streit, Hartwig Anzt, Charlotte Debus	2025-10-09	Download	Sparse linear algebra is a cornerstone of many scientific computing and machine learning applications. Python has become a popular choice for these applications due to its simplicity and ease of use.
