2025-01-31
cs.AR - Architecture
| Title | Authors | Published | PDF | Abstract |
|---|---|---|---|---|
| Theoretical complexity analysis of many-cores on a single chip | Ran Ginosar | 2025-01-31 | Download | When a single core is scaled up to m cores occupying the same chip area and executing the same (parallelizable) task, achievable speedup is square-root m, power is reduced by square-root m and energy ... |
| Efficient Read-Port-Count Reduction Schemes for the Centralized Physical Register File in a Superscalar Microprocessor | Denis Los | 2025-01-31 | Download | The physical register file supports increasing the execution width and depth of a superscalar microprocessor to exploit more instruction-level parallelism. |
| An All-digital 8.6-nJ/Frame 65-nm Tsetlin Machine Image Classification Accelerator | Svein Anders Tunheim, Yujin Zheng, Lei Jiao, Rishad Shafik, Alex Yakovlev, Ole-Christoffer Granmo | 2025-01-31 | Download | We present an all-digital programmable machine learning accelerator chip for image classification, underpinning on the Tsetlin machine (TM) principles. |
| A Tensor-Train Decomposition based Compression of LLMs on Group Vector Systolic Accelerator | Sixiao Huang, Tintin Wang, Ang Li, Ao Shen, Kai Li, Keyao Jiang, Mingqiang Huang, Hao Yu | 2025-01-31 | Download | Large language models (LLMs) are both storage-intensive and computation-intensive, posing significant challenges when deployed on resource-constrained hardware. |
| StruM: Structured Mixed Precision for Efficient Deep Learning Hardware Codesign | Michael Wu, Arnab Raha, Deepak A. Mathaikutty, Martin Langhammer, Engin Tunali, Daksha Sharma | 2025-01-31 | Download | In this paper, we propose StruM, a novel structured mixed-precision-based deep learning inference method, co-designed with its associated hardware accelerator (DPU), to address the escalating computat... |
| Latch Based Design for Fast Voltage Droop Response | Shreyas Srinivas, Ian W Jones, Goran Panic, Christoph Lenzen | 2025-01-31 | Download | We present a latch-based and PLL-free design of the voltage droop correction circuit of Lenzen, Fuegger, Kinali, and Wiederhake\cite{DroopJournal}. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | PDF | Abstract |
|---|---|---|---|---|
| The Free Termination Property of Queries Over Time | Conor Power, Paraschos Koutris, Joseph M Hellerstein | 2025-01-31 | Download | Building on prior work on distributed databases and the CALM Theorem, we define and study the question of free termination: in the absence of distributed coordination, what query properties allow node... |
| BICompFL: Stochastic Federated Learning with Bi-Directional Compression | Maximilian Egger, Rawad Bitar, Antonia Wachter-Zeh, Nir Weinberger, Deniz Gündüz | 2025-01-31 | Download | We address the prominent communication bottleneck in federated learning (FL). We specifically consider stochastic FL, in which models or compressed model updates are specified by distributions rather ... |
| Byzantine-Resilient Zero-Order Optimization for Communication-Efficient Heterogeneous Federated Learning | Maximilian Egger, Mayank Bakshi, Rawad Bitar | 2025-01-31 | Download | We introduce CyBeR-0, a Byzantine-resilient federated zero-order optimization method that is robust under Byzantine attacks and provides significant savings in uplink and downlink communication costs. |
| Asynchronous Fault-Tolerant Language Decidability for Runtime Verification of Distributed Systems | Armando Castañeda, Gilde Valeria Rodríguez | 2025-01-31 | Download | Implementing correct distributed systems is an error-prone task. Runtime Verification (RV) offers a lightweight formal method to improve reliability by monitoring system executions against correctness... |
| Longer Attention Span: Increasing Transformer Context Length with Sparse Graph Processing Techniques | Nathaniel Tomczak, Sanmukh Kuppannagari | 2025-01-31 | Download | Transformers have demonstrated great success in numerous domains including natural language processing and bioinformatics. This success stems from the use of the attention mechanism by these models in... |
| JustAct+: A Framework for Auditable Multi-Agent Systems Regulated by Inter-Organisational Policies | Christopher A. Esterhuyse, Tim Müller, L. Thomas van Binsbergen | 2025-01-31 | Download | In open multi-agent systems that cross organisational boundaries, agent actions must be regulated by complex policies. Consider medical data processing systems, which must observe generic laws (... |
| S-VOTE: Similarity-based Voting for Client Selection in Decentralized Federated Learning | Pedro Miguel Sánchez Sánchez, Enrique Tomás Martínez Beltrán, Chao Feng, Gérôme Bovet, Gregorio Martínez Pérez, Alberto Huertas Celdrán | 2025-01-31 | Download | Decentralized Federated Learning (DFL) enables collaborative, privacy-preserving model training without relying on a central server. This decentralized approach reduces bottlenecks and eliminates sing... |
| FL-APU: A Software Architecture to Ease Practical Implementation of Cross-Silo Federated Learning | F. Stricker, J. A. Peregrina, D. Bermbach, C. Zirpins | 2025-01-31 | Download | Federated Learning (FL) is an upcoming technology that is increasingly applied in real-world applications. Early applications focused on cross-device scenarios, where many participants with limited re... |
| A Bias-Correction Decentralized Stochastic Gradient Algorithm with Momentum Acceleration | Yuchen Hu, Xi Chen, Weidong Liu, Xiaojun Mao | 2025-01-31 | Download | Distributed stochastic optimization algorithms can simultaneously process large-scale datasets, significantly accelerating model training. However, their effectiveness is often hindered by the sparsit... |
| CPU vs. GPU for Community Detection: Performance Insights from GVE-Louvain and ν-Louvain | Subhajit Sahu | 2025-01-31 | Download | Community detection involves identifying natural divisions in networks, a crucial task for many large-scale applications. This report presents GVE-Louvain, one of the most efficient multicore implemen... |
| BSODiag: A Global Diagnosis Framework for Batch Servers Outage in Large-scale Cloud Infrastructure Systems | Tao Duan, Runqing Chen, Pinghui Wang, Junzhou Zhao, Jiongzhou Liu, Shujie Han, Yi Liu, Fan Xu | 2025-01-31 | Download | Cloud infrastructure is the collective term for all physical devices within cloud systems. Failures within the cloud infrastructure system can severely compromise the stability and availability of clo... |
| Continuous-Time Analysis of Federated Averaging | Tom Overman, Diego Klabjan | 2025-01-31 | Download | Federated averaging (FedAvg) is a popular algorithm for horizontal federated learning (FL), where samples are gathered across different clients and are not shared with each other or a central server. |
| Infer-EDGE: Dynamic DNN Inference Optimization in 'Just-in-time' Edge-AI Implementations | Motahare Mounesan, Xiaojie Zhang, Saptarshi Debroy | 2025-01-31 | Download | Balancing mutually diverging performance metrics, such as end-to-end latency, accuracy, and device energy consumption, is a challenging undertaking for deep neural network (DNN) inference in Just-in-T... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | PDF | Abstract |
|---|---|---|---|---|
| Open RAN Slicing with Quantum Optimization | Patatchona Keyela, Soumaya Cherkaoui | 2025-01-31 | Download | RAN slicing technology is a key aspect of the Open RAN paradigm, allowing simultaneous and independent provision of various services such as ultra-reliable low-latency communications (URLLC), enhanced... |
| Characterizing User Behavior: The Interplay Between Mobility Patterns and Mobile Traffic | Anne Josiane Kouam, Aline Carneiro Viana, Mariano G. Beiró, Leo Ferres, Luca Pappalardo | 2025-01-31 | Download | Mobile devices have become essential for capturing human activity, and eXtended Data Records (XDRs) offer rich opportunities for detailed user behavior modeling, which is useful for designing personal... |
| Synthetic User Behavior Sequence Generation with Large Language Models for Smart Homes | Zhiyao Xu, Dan Zhao, Qingsong Zou, Jingyu Xiao, Yong Jiang, Zhenhui Yuan, Qing Li | 2025-01-31 | Download | In recent years, as smart home systems have become more widespread, security concerns within these environments have become a growing threat. Currently, most smart home security solutions, such as ano... |
| APEX: Automated Parameter Exploration for Low-Power Wireless Protocols | Mohamed Hassaan M. Hydher, Markus Schuss, Olga Saukh, Kay Römer, Carlo Alberto Boano | 2025-01-31 | Download | Careful parametrization of networking protocols is crucial to maximize the performance of low-power wireless systems and ensure that stringent application requirements can be met. |
| On Measuring Available Capacity in High-speed Cloud Networks | Ganapathy Raman Madanagopal, Christofer Flinta, Andreas Johnsson, Farnaz Moradi, Daniel Turull | 2025-01-31 | Download | Measurement of available path capacity with high accuracy over high-speed links deployed in cloud and transport networks is vital for performance assessment and traffic engineering. |
| Reliability Modeling for Beyond-5G Mission Critical Networks Using Effective Capacity | Anudeep Karnam, Jobish John, Kishor C. Joshi, George Exarchakos, Sonia Heemstra de Groot, Ignas Niemegeers | 2025-01-31 | Download | Accurate reliability modeling for ultra-reliable low latency communication (URLLC) and hyper-reliable low latency communication (HRLLC) networks is challenging due to the complex interactions between ... |
| Swift: Rethinking RDMA Control Plane for Elastic Computing | Junxue Zhang, Han Tian, Xinyang Huang, Wenxue Li, Kaiqiang Xu, Dian Shen, Yong Wang, Kai Chen | 2025-01-31 | Download | Elastic computing enables dynamic scaling to meet workload demands, and Remote Direct Memory Access (RDMA) enhances this by providing high-throughput, low-latency network communication. |
| Sharing GPUs and Programmable Switches in a Federated Testbed with SHARY | Stefano Salsano, Andrea Mayer, Paolo Lungaroni, Pierpaolo Loreti, Lorenzo Bracciale, Andrea Detti, Marco Orazi, Paolo Giaccone, Fulvio Risso, Alessandro Cornacchia, Carla Fabiana Chiasserini | 2025-01-31 | Download | Federated testbeds enable collaborative research by providing access to diverse resources, including computing power, storage, and specialized hardware like GPUs, programmable switches and smart Netwo... |
cs.OS - Operating Systems
| Title | Authors | Published | PDF | Abstract |
|---|---|---|---|---|
| On Measuring Available Capacity in High-speed Cloud Networks | Ganapathy Raman Madanagopal, Christofer Flinta, Andreas Johnsson, Farnaz Moradi, Daniel Turull | 2025-01-31 | Download | Measurement of available path capacity with high accuracy over high-speed links deployed in cloud and transport networks is vital for performance assessment and traffic engineering. |
cs.PF - Performance
| Title | Authors | Published | PDF | Abstract |
|---|---|---|---|---|
| Longer Attention Span: Increasing Transformer Context Length with Sparse Graph Processing Techniques | Nathaniel Tomczak, Sanmukh Kuppannagari | 2025-01-31 | Download | Transformers have demonstrated great success in numerous domains including natural language processing and bioinformatics. This success stems from the use of the attention mechanism by these models in... |