2025-11-05
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| AnaFlow: Agentic LLM-based Workflow for Reasoning-Driven Explainable and Sample-Efficient Analog Circuit Sizing | Mohsen Ahmadzadeh, Kaichang Chen, Georges Gielen | 2025-11-05 | Download | Analog/mixed-signal circuits are key for interfacing electronics with the physical world. Their design, however, remains a largely handcrafted process, resulting in long and error-prone design cycles. |
| ML-PCM : Machine Learning Technique for Write Optimization in Phase Change Memory (PCM) | Mahek Desai, Rowena Quinn, Marjan Asadinia | 2025-11-05 | Download | As transistor-based memory technologies like dynamic random access memory (DRAM) approach their scalability limits, the need to explore alternative storage solutions becomes increasingly urgent. |
| SMART-WRITE: Adaptive Learning-based Write Energy Optimization for Phase Change Memory | Mahek Desai, Rowena Quinn, Marjan Asadinia | 2025-11-05 | Download | As dynamic random access memory (DRAM) and other current transistor-based memories approach their scalability limits, the search for alternative storage methods becomes increasingly urgent. |
| LoRA-Edge: Tensor-Train-Assisted LoRA for Practical CNN Fine-Tuning on Edge Devices | Hyunseok Kwak, Kyeongwon Lee, Jae-Jin Lee, Woojoo Lee | 2025-11-05 | Download | On-device fine-tuning of CNNs is essential to withstand domain shift in edge applications such as Human Activity Recognition (HAR), yet full fine-tuning is infeasible under strict memory, compute, and... |
| Design and Optimization of Mixed-Kernel Mixed-Signal SVMs for Flexible Electronics | Florentia Afentaki, Maha Shatta, Konstantinos Balaskas, Georgios Panagopoulos, Georgios Zervakis, Mehdi B. Tahoori | 2025-11-05 | Download | Flexible Electronics (FE) have emerged as a promising alternative to silicon-based technologies, offering on-demand low-cost fabrication, conformality, and sustainability. |
| LaMoS: Enabling Efficient Large Number Modular Multiplication through SRAM-based CiM Acceleration | Haomin Li, Fangxin Liu, Chenyang Guan, Zongwu Wang, Li Jiang, Haibing Guan | 2025-11-05 | Download | Barrett's algorithm is one of the most widely used methods for performing modular multiplication, a critical nonlinear operation in modern privacy computing techniques such as homomorphic encryption (... |
| Delay Time Characterization on FPGA: A Low Nonlinearity, Picosecond Resolution Time-to-Digital Converter on 16-nm FPGA using Bin Sequence Calibration | Sunwoo Park, Byungkwon Park, Eunsung Kim, Jiwon Yune, Seungho Han, Seunggo Nam | 2025-11-05 | Download | We present a Time-to-Digital Converter (TDC) implemented on a 16 nm Xilinx UltraScale Plus FPGA that achieves a resolution of 1.15 ps, RMS precision of 3. |
| An Event-Driven Spiking Compute-In-Memory Macro based on SOT-MRAM | Deyang Yu, Chenchen Liu, Chuanjie Zhang, Xiao Fang, Weisheng Zhao | 2025-11-05 | Download | The application of Magnetic Random-Access Memory (MRAM) in computing-in-memory (CIM) has gained significant attention. However, existing designs often suffer from high energy consumption due to their ... |
| SnapStream: Efficient Long Sequence Decoding on Dataflow Accelerators | Jonathan Li, Nasim Farahini, Evgenii Iuliugin, Magnus Vesterlund, Christian Häggström, Guangtao Wang, Shubhangi Upasani, Ayush Sachdeva, Rui Li, Faline Fu, Chen Wu, Ayesha Siddiqua, John Long, Tuowen Zhao, Matheen Musaddiq, Håkan Zeffer, Yun Du, Mingran Wang, Qinghua Li, Bo Li, Urmish Thakker, Raghu Prabhakar | 2025-11-05 | Download | The proliferation of 100B+ parameter Large Language Models (LLMs) with 100k+ context length support have resulted in increasing demands for on-chip memory to support large KV caches. |
| LogicSparse: Enabling Engine-Free Unstructured Sparsity for Quantised Deep-learning Accelerators | Changhong Li, Biswajit Basu, Shreejith Shanker | 2025-11-05 | Download | FPGAs have been shown to be a promising platform for deploying Quantised Neural Networks (QNNs) with high-speed, low-latency, and energy-efficient inference. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| OMPILOT: Harnessing Transformer Models for Auto Parallelization to Shared Memory Computing Paradigms | Arijit Bhattacharjee, Ali TehraniJamsaz, Le Chen, Niranjan Hasabnis, Mihai Capota, Nesreen Ahmed, Ali Jannesari | 2025-11-05 | Download | Recent advances in large language models (LLMs) have significantly accelerated progress in code translation, enabling more accurate and efficient transformation across programming languages. |
| A General Input-Dependent Colorless Computability Theorem and Applications to Core-Dependent Adversaries | Yannis Coutouly, Emmanuel Godard | 2025-11-05 | Download | Distributed computing tasks can be presented with a triple $(\mathcal{I}, \mathcal{O}, \Delta)$. The solvability of a colorless task on the Iterated Immediate Snapshot model (IIS) has been characterized by the Colorless Comp... |
| Stone Duality Proofs for Colorless Distributed Computability Theorems | Cameron Calk, Emmanuel Godard | 2025-11-05 | Download | We introduce a new topological encoding of executions of round-based, full-information distributed protocols via spectral spaces. Such protocols constitute a model of distributed computations which ar... |
| Investigating the Impact of Isolation on Synchronized Benchmarks | Nils Japke, Furat Hamdan, Diana Baumann, David Bermbach | 2025-11-05 | Download | Benchmarking in cloud environments suffers from performance variability from multi-tenant resource contention. Duet benchmarking mitigates this by running two workload versions concurrently on the sam... |
| AnchorTP: Resilient LLM Inference with State-Preserving Elastic Tensor Parallelism | Wendong Xu, Chujie Chen, He Xiao, Kuan Li, Jing Xiong, Chen Zhang, Wenyong Zhou, Chaofan Tao, Yang Bai, Bei Yu, Ngai Wong | 2025-11-05 | Download | Large Language Model (LLM) inference services demand exceptionally high availability and low latency, yet multi-GPU Tensor Parallelism (TP) makes them vulnerable to single-GPU failures. |
| Universal Quantum Simulation of 50 Qubits on Europe's First Exascale Supercomputer Harnessing Its Heterogeneous CPU-GPU Architecture | Hans De Raedt, Jiri Kraus, Andreas Herten, Vrinda Mehta, Mathis Bode, Markus Hrywniak, Kristel Michielsen, Thomas Lippert | 2025-11-05 | Download | We have developed a new version of the high-performance Jülich universal quantum computer simulator (JUQCS-50) that leverages key features of the GH200 superchips as used in the JUPITER supercomputer,... |
| UMDAM: A Unified Data Layout and DRAM Address Mapping for Heterogenous NPU-PIM | Hai Huang, Xuhong Qiang, Weisheng Zhao, Chenchen Liu | 2025-11-05 | Download | Large Language Models (LLMs) are increasingly deployed on edge devices with Neural Processing Units (NPUs), yet the decode phase remains memory-intensive, limiting performance. |
| Characterising Global Platforms: Centralised, Decentralised, Federated, and Grassroots | Ehud Shapiro | 2025-11-05 | Download | Global digital platforms are software systems designed to serve entire populations, with some already serving billions of people. We propose atomic transactions-based multiagent transition systems and... |
| SnapStream: Efficient Long Sequence Decoding on Dataflow Accelerators | Jonathan Li, Nasim Farahini, Evgenii Iuliugin, Magnus Vesterlund, Christian Häggström, Guangtao Wang, Shubhangi Upasani, Ayush Sachdeva, Rui Li, Faline Fu, Chen Wu, Ayesha Siddiqua, John Long, Tuowen Zhao, Matheen Musaddiq, Håkan Zeffer, Yun Du, Mingran Wang, Qinghua Li, Bo Li, Urmish Thakker, Raghu Prabhakar | 2025-11-05 | Download | The proliferation of 100B+ parameter Large Language Models (LLMs) with 100k+ context length support have resulted in increasing demands for on-chip memory to support large KV caches. |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Inter-Agent Trust Models: A Comparative Study of Brief, Claim, Proof, Stake, Reputation and Constraint in Agentic Web Protocol Design-A2A, AP2, ERC-8004, and Beyond | Botao 'Amber' Hu, Helena Rong | 2025-11-05 | Download | As the "agentic web" takes shape-billions of AI agents (often LLM-powered) autonomously transacting and collaborating-trust shifts from human oversight to protocol design. |
| Integrity Under Siege: A Rogue gNodeB's Manipulation of 5G Network Slice Allocation | Jiali Xu, Valeria Loscri, Romain Rouvoy | 2025-11-05 | Download | The advent of 5G networks, with network slicing as a cornerstone technology, promises customized, high-performance services, but also introduces novel attack surfaces beyond traditional threats. |
| Joint Optimization of DNN Model Caching and Request Routing in Mobile Edge Computing | Shuting Qiu, Fang Dong, Siyu Tan, Ruiting Zhou, Dian Shen, Patrick P. C. Lee, Qilin Fan | 2025-11-05 | Download | Mobile edge computing (MEC) can pre-cache deep neural networks (DNNs) near end-users, providing low-latency services and improving users' quality of experience (QoE). |
| Handover Configurations in Operational 5G Networks: Diversity, Evolution, and Impact on Performance | Moinak Ghoshal, Imran Khan, Phuc Dinh, Z. Jonny Kong, Omar Basit, Sizhe Wang, Yufei Feng, Y. Charlie Hu, Dimitrios Koutsonikolas | 2025-11-05 | Download | Mobility management in cellular networks, especially the handover (HO) process, plays a key role in providing seamless and ubiquitous Internet access. |
| CRSF: Enabling QoS-Aware Beyond-Connectivity Service Sharing in 6G Local Networks | Pragya Sharma, Amanda Xiang, Abbas Kiani, John Kaippallimalil, Tony Saboorian, Haining Wang | 2025-11-05 | Download | Sixth-generation (6G) networks are envisioned to support interconnected local subnetworks that can share specialized, beyond-connectivity services. |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| OMPILOT: Harnessing Transformer Models for Auto Parallelization to Shared Memory Computing Paradigms | Arijit Bhattacharjee, Ali TehraniJamsaz, Le Chen, Niranjan Hasabnis, Mihai Capota, Nesreen Ahmed, Ali Jannesari | 2025-11-05 | Download | Recent advances in large language models (LLMs) have significantly accelerated progress in code translation, enabling more accurate and efficient transformation across programming languages. |
| One Size Does Not Fit All: Architecture-Aware Adaptive Batch Scheduling with DEBA | François Belias, Naser Ezzati-Jivan, Foutse Khomh | 2025-11-05 | Download | Adaptive batch size methods aim to accelerate neural network training, but existing approaches apply identical adaptation strategies across all architectures, assuming a one-size-fits-all solution. |
| PerfDojo: Automated ML Library Generation for Heterogeneous Architectures | Andrei Ivanov, Siyuan Shen, Gioele Gottardo, Marcin Chrapek, Afif Boudaoud, Timo Schneider, Luca Benini, Torsten Hoefler | 2025-11-05 | Download | The increasing complexity of machine learning models and the proliferation of diverse hardware architectures (CPUs, GPUs, accelerators) make achieving optimal performance a significant challenge. |
| Exploring Topologies in Quantum Annealing: A Hardware-Aware Perspective | Mario Bifulco, Luca Roversi | 2025-11-05 | Download | Quantum Annealing (QA) offers a promising framework for solving NP-hard optimization problems, but its effectiveness is constrained by the topology of the underlying quantum hardware. |