Modeling Naval Fleet Action Sequences and Adversary’s Responses with Deep Reinforcement Learning

Authors
Barton, Armon
Subjects
reinforcement learning
wargaming
Deep Q-Networks
Date of Issue
2023-09-20
Publisher
Monterey, CA; Naval Postgraduate School
Abstract
In a rapidly evolving maritime warfare landscape, the U.S. Navy and its allies require their crews to quickly identify optimal strategies for vessel engagements to ensure freedom of the seas. This need is heightened by the potentially grave consequences of suboptimal maneuvers, as illustrated by the cases of the USS John S. McCain and USS Fitzgerald. Recent advances in machine learning and artificial intelligence (AI) offer a promising solution. AI systems have made significant strides in outperforming human experts in complex games such as chess, poker, and StarCraft, and these advances now have the potential to benefit real-time decision-making and wargaming in the naval domain. This study explores the potential of reinforcement learning (RL) techniques in naval contexts, where they could provide valuable decision-support tools to ship captains and their staff by suggesting optimal movement strategies in complex maritime scenarios. Exemplar naval scenarios were designed and modeled within a combat simulation environment; AI agents (a mix of rule-based, method-based, and value-based approaches) were designed; and the performance of these agents was evaluated and compared. The aim was to assess the agents' ability to identify optimal movements against a rule-based adversary and to compare their performance against human-level play. A significant finding is that the Deep Q-Networks (DQN) agent outperformed all the other AI agents assessed in this study. The DQN agent demonstrated a promising ability to identify optimal strategies in different situations, outperforming human players in some of the scenarios presented. DQN's robustness and adaptability allowed it to generalize across different operational contexts, making it a candidate worth investigating further for application in the U.S. Navy's decision-making processes.
Type
Report
Description
NPS NRP Executive Summary
Sponsors
ASN(RDA) - Research, Development, and Acquisition
Funding
This research is supported by funding from the Naval Postgraduate School, Naval Research Program (PE0605853N/2098). https://nps.edu/nrp
Chief of Naval Operations (CNO)
Format
5 p.
Distribution Statement
Approved for public release. Distribution is unlimited.
Rights
This publication is a work of the U.S. Government as defined in Title 17, United States Code, Section 101. Copyright protection is not available for this work in the United States.