
2024-02-01

cs.AR - Architecture

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| Ultra Fast Transformers on FPGAs for Particle Physics Experiments | Zhixing Jiang, Dennis Yin, Elham E Khoda, Vladimir Loncar, Ekaterina Govorkova, Eric Moreno, Philip Harris, Scott Hauck, Shih-Chieh Hsu | 2024-02-01 | This work introduces a highly efficient implementation of the transformer architecture on a Field-Programmable Gate Array (FPGA) by using the `hls4ml` tool. |
| Improving the Representativeness of Simulation Intervals for the Cache Memory System | Nicolas Bueno, Fernando Castro, Luis Pinuel, Jose Ignacio Gomez-Perez, Francky Catthoor | 2024-02-01 | Accurate simulation techniques are indispensable to efficiently propose new memory or architectural organizations. As implementing new hardware concepts in real systems is often not feasible, cycle-ac... |
| Cocco: Hardware-Mapping Co-Exploration towards Memory Capacity-Communication Optimization | Zhanhong Tan, Zijian Zhu, Kaisheng Ma | 2024-02-01 | Memory is a critical design consideration in current data-intensive DNN accelerators, as it profoundly determines energy consumption, bandwidth requirements, and area costs. |
| Reuse Detector: Improving the Management of STT-RAM SLLCs | Roberto Rodríguez-Rodríguez, Javier Díaz, Fernando Castro, Pablo Ibáñez, Daniel Chaver, Víctor Viñals, Juan Carlos Saez, Manuel Prieto-Matias, Luis Pinuel, Teresa Monreal, Jose María Llabería | 2024-02-01 | Various constraints of Static Random Access Memory (SRAM) are leading to consider new memory technologies as candidates for building on-chip shared last-level caches (SLLCs). |
| Optimization of a Line Detection Algorithm for Autonomous Vehicles on a RISC-V with Accelerator | María José Belda, Katzalin Olcoz, Fernando Castro, Francisco Tirado | 2024-02-01 | In recent years, autonomous vehicles have attracted the attention of many research groups, both in academia and business, including researchers from leading companies such as Google, Uber and Tesla. |
| ONE-SA: Enabling Nonlinear Operations in Systolic Arrays for Efficient and Flexible Neural Network Inference | Ruiqi Sun, Yinchen Ni, Xin He, Jie Zhao, An Zou | 2024-02-01 | The computation and memory-intensive nature of DNNs limits their use in many mobile and embedded contexts. Application-specific integrated circuit (ASIC) hardware accelerators employ matrix multiplica... |
| AssertLLM: Generating and Evaluating Hardware Verification Assertions from Design Specifications via Multi-LLMs | Wenji Fang, Mengming Li, Min Li, Zhiyuan Yan, Shang Liu, Hongce Zhang, Zhiyao Xie | 2024-02-01 | Assertion-based verification (ABV) is a critical method for ensuring design circuits comply with their architectural specifications, which are typically described in natural language. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| Ensuring Data Privacy in AC Optimal Power Flow with a Distributed Co-Simulation Framework | Xinliang Dai, Alexander Kocher, Jovana Kovačević, Burak Dindar, Yuning Jiang, Colin N. Jones, Hüseyin Çakmak, Veit Hagenmeyer | 2024-02-01 | During the energy transition, the significance of collaborative management among institutions is rising, confronting challenges posed by data privacy concerns. |
| A Review on Industrial Augmented Reality Systems for the Industry 4.0 Shipyard | Paula Fraga-Lamas, Tiago M Fernandez-Carames, Oscar Blanco-Novoa, Miguel Vilar-Montesinos | 2024-02-01 | Shipbuilding companies are upgrading their inner workings in order to create Shipyards 4.0, where the principles of Industry 4.0 are paving the way to further digitalized and optimized processes in an... |
| Profiling and Modeling of Power Characteristics of Leadership-Scale HPC System Workloads | Ahmad Maroof Karimi, Naw Safrin Sattar, Woong Shin, Feiyi Wang | 2024-02-01 | In the exascale era in which application behavior has large power & energy footprints, per-application job-level awareness of such impression is crucial in taking steps towards achieving efficiency go... |
| Comparative Study of Large Language Model Architectures on Frontier | Junqi Yin, Avishek Bose, Guojing Cong, Isaac Lyngaas, Quentin Anthony | 2024-02-01 | Large language models (LLMs) have garnered significant attention in both the AI community and beyond. Among these, the Generative Pre-trained Transformer (GPT) has emerged as the dominant architecture... |
| Revising Apetrei's bounding volume hierarchy construction algorithm to allow stackless traversal | Andrey Prokopenko, Damien Lebrun-Grandié | 2024-02-01 | Stackless traversal is a technique to speed up range queries by avoiding usage of a stack during the tree traversal. One way to achieve that is to transform a given binary tree to store a left child a... |
| Workflow Optimization for Parallel Split Learning | Joana Tirana, Dimitra Tsigkari, George Iosifidis, Dimitris Chatzopoulos | 2024-02-01 | Split learning (SL) has been recently proposed as a way to enable resource-constrained devices to train multi-parameter neural networks (NNs) and participate in federated learning (FL). |
| Towards a GPU-Parallelization of the neXtSIM-DG Dynamical Core | Robert Jendersie, Christian Lessig, Thomas Richter | 2024-02-01 | The cryosphere plays a significant role in Earth's climate system. Therefore, an accurate simulation of sea ice is of great importance to improve climate projections. |
| Reconfigurable Intelligent Computational Surfaces for MEC-Assisted Autonomous Driving Networks | Bo Yang, Xueyao Zhang, Zhiwen Yu, Xuelin Cao, Chongwen Huang, George C. Alexandropoulos, Yan Zhang, Merouane Debbah, Chau Yuen | 2024-02-01 | In this paper, we focus on improving autonomous driving safety via task offloading from cellular vehicles (CVs), using vehicle-to-infrastructure (V2I) links, to a multi-access edge computing (MEC) se... |
| BlackMamba: Mixture of Experts for State-Space Models | Quentin Anthony, Yury Tokpanov, Paolo Glorioso, Beren Millidge | 2024-02-01 | State-space models (SSMs) have recently demonstrated competitive performance to transformers at large-scale language modeling benchmarks while achieving linear time and memory complexity as a function... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| Integrating Multi -WAN, VPN and IEEE 802.3ad for Advanced IPSEC | Stefan Ćertić | 2024-02-01 | Since the emergence of the internet, IPSEC has undergone significant changes due to changes in the type and behavior of users worldwide. IEEE 802. |
| When to Preempt in a Status Update System? | Subhankar Banerjee, Sennur Ulukus | 2024-02-01 | We consider a time-slotted status update system with an error-free preemptive queue. The goal of the sampler-scheduler pair is to minimize the age of information at the monitor by sampling and transmi... |
| X-CBA: Explainability Aided CatBoosted Anomal-E for Intrusion Detection System | Kiymet Kaya, Elif Ak, Sumeyye Bas, Berk Canberk, Sule Gunduz Oguducu | 2024-02-01 | The effectiveness of Intrusion Detection Systems (IDS) is critical in an era where cyber threats are becoming increasingly complex. Machine learning (ML) and deep learning (DL) models provide an effic... |
| A YANG-aided Unified Strategy for Black Hole Detection for Backbone Networks | Elif Ak, Kiymet Kaya, Eren Ozaltun, Sule Gunduz Oguducu, Berk Canberk | 2024-02-01 | Despite the crucial importance of addressing Black Hole failures in Internet backbone networks, effective detection strategies in backbone networks are lacking. |
| Intent Assurance using LLMs guided by Intent Drift | Kristina Dzeparoska, Ali Tizghadam, Alberto Leon-Garcia | 2024-02-01 | Intent-Based Networking (IBN) presents a paradigm shift for network management, by promising to align intents and business objectives with network operations--in an automated manner. |
| Cloudy with a Chance of Green: Measuring the Predictability of 18,009 Traffic Lights in Hamburg | Daniel Jeschor, Philipp Matthes, Thomas Springer, Sebastian Pape, Sven Fröhlich | 2024-02-01 | Informing drivers about the predicted state of upcoming traffic lights is considered a key solution to reduce unneeded energy expenditure and dilemma zones at intersections. |
| Deep Learning Approaches for Network Traffic Classification in the Internet of Things (IoT): A Survey | Jawad Hussain Kalwar, Sania Bhatti | 2024-02-01 | The Internet of Things (IoT) has witnessed unprecedented growth, resulting in a massive influx of diverse network traffic from interconnected devices. |
| Workflow Optimization for Parallel Split Learning | Joana Tirana, Dimitra Tsigkari, George Iosifidis, Dimitris Chatzopoulos | 2024-02-01 | Split learning (SL) has been recently proposed as a way to enable resource-constrained devices to train multi-parameter neural networks (NNs) and participate in federated learning (FL). |
| Experimental Evaluation of Interactive Edge/Cloud Virtual Reality Gaming over Wi-Fi using Unity Render Streaming | Miguel Casasnovas, Costas Michaelides, Marc Carrascosa-Zamacois, Boris Bellalta | 2024-02-01 | Virtual Reality (VR) streaming enables end-users to seamlessly immerse themselves in interactive virtual environments using even low-end devices. |
| Develop End-to-End Anomaly Detection System | Emanuele Mengoli, Zhiyuan Yao, Wutao Wei | 2024-02-01 | Anomaly detection plays a crucial role in ensuring network robustness. However, implementing intelligent alerting systems becomes a challenge when considering scenarios in which anomalies can be cause... |
| bypass4netns: Accelerating TCP/IP Communications in Rootless Containers | Naoki Matsumoto, Akihiro Suda | 2024-02-01 | "Rootless containers" is a concept to run the entire container runtimes and containers without the root privileges. It protects the host environment from attackers exploiting container runtime vulnera... |

cs.OS - Operating Systems

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| bypass4netns: Accelerating TCP/IP Communications in Rootless Containers | Naoki Matsumoto, Akihiro Suda | 2024-02-01 | "Rootless containers" is a concept to run the entire container runtimes and containers without the root privileges. It protects the host environment from attackers exploiting container runtime vulnera... |
