AN EVALUATION OF USING DETERMINISTIC HEURISTICS TO ACCELERATE REINFORCEMENT LEARNING
Walton, Garret M.
Xie, Geoffrey G.
Rohrer, Justin P.
Neural networks frequently face long training times, depending on the corpus of data available to them. Reinforcement learning in particular can take a long time to attain satisfactory performance. Recent efforts to incorporate deterministic logical rules and physical laws into a neural network have met with promising results. Starting from an existing baseline neural network that is designed to learn Pong strictly from a pixel representation of the game board, this thesis adds a ball trajectory-based heuristic to the learning process and evaluates its performance. The evaluation initially shows game score improvements, but then shows a sharp score degradation after about 25,000 games. A second evaluation shows that the heuristic increases training time by approximately 35%. More work remains to assess the long-term viability of this approach.
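To make the idea of a ball trajectory-based heuristic concrete, the following is a minimal Python sketch of one way such a rule could be written. It is an illustration only, not the thesis implementation: the abstract does not say how the heuristic is computed or how it enters the learning process. The function names, the coordinate convention (y grows downward, walls at y = 0 and y = height), the constant ball velocity, and the dead_zone parameter are all assumptions made for this example.

```python
from typing import Optional


def predict_intercept_y(ball_x: float, ball_y: float,
                        vel_x: float, vel_y: float,
                        paddle_x: float, height: float) -> Optional[float]:
    """Project the ball in a straight line to the paddle's column.

    Hypothetical helper: assumes constant velocity and perfectly
    elastic bounces off the walls at y = 0 and y = height.
    Returns None when the ball is not moving toward the paddle.
    """
    if vel_x == 0 or (paddle_x - ball_x) * vel_x < 0:
        return None
    steps = (paddle_x - ball_x) / vel_x   # time until the ball reaches paddle_x
    raw_y = ball_y + vel_y * steps        # y position if there were no walls
    period = 2 * height                   # one down-and-back cycle between the walls
    folded = raw_y % period               # Python's % is non-negative for a positive period
    # Reflect the second half of the cycle back into [0, height].
    return period - folded if folded > height else folded


def heuristic_action(paddle_y: float, target_y: Optional[float],
                     dead_zone: float = 2.0) -> str:
    """Suggest a paddle move toward the predicted intercept (assumed action names)."""
    if target_y is None:
        return "STAY"
    if target_y < paddle_y - dead_zone:
        return "UP"      # smaller y is higher on the screen under the assumed convention
    if target_y > paddle_y + dead_zone:
        return "DOWN"
    return "STAY"
```

A suggestion like this could, for example, be compared against the network's chosen action during training, but how the thesis actually combines the two is not described in this abstract.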
Rights: This publication is a work of the U.S. Government as defined in Title 17, United States Code, Section 101. Copyright protection is not available for this work in the United States.