Balancing exploration and exploitation in agent learning
Darken, Christian J.
Alt, Jonathan K.
Controlling the ratio of exploration to exploitation in dynamic environments is a continuing challenge in applying agent-learning techniques. Representing human behavior in simulations requires methods that control this ratio in a way that mimics human behavior, so that agent-learning mechanisms are constrained in a manner similar to that observed in human cognition. The Cultural Geography (CG) model, under development at TRAC Monterey, is an agent-based social simulation. Because it simulates a wide variety of situations and scenarios, a dynamic ratio between exploration and exploitation makes agent decisions more sensible. As part of an effort to improve the model, this thesis investigates enhancements to the exploration-exploitation balance using several different techniques. The work includes the design of experiments covering a range of factors in multiple environments, together with statistical analysis of the results. The main finding is that in small environments, techniques based on subjective utility give better results in short runs, while techniques based on time obtain higher utilities than the other techniques in long runs. In larger, more complex environments, a combined technique performed better in long runs.
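The abstract does not specify how the time-based techniques are defined. As an illustration only, the general idea of tying the exploration rate to elapsed time can be sketched with an epsilon-greedy agent on a simple multi-armed bandit. Everything in this sketch is an assumption made for the example and is not taken from the thesis or the CG model: the function names, the decay schedule, the parameter values, and the Gaussian reward model.

```python
import random


def time_based_epsilon(step, eps_start=1.0, eps_min=0.05, decay=0.01):
    """Exploration probability that shrinks as the step count grows.

    Hypothetical schedule: eps_start / (1 + decay * step), floored at eps_min.
    """
    return max(eps_min, eps_start / (1.0 + decay * step))


def choose_action(estimates, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit the best estimate."""
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))
    return max(range(len(estimates)), key=lambda a: estimates[a])


def run_bandit(true_means, steps, seed=0):
    """Run an epsilon-greedy agent whose exploration rate decays over time.

    Rewards are drawn from unit-variance Gaussians (an assumed toy model).
    Returns the value estimates, per-action counts, and total reward.
    """
    rng = random.Random(seed)
    estimates = [0.0] * len(true_means)
    counts = [0] * len(true_means)
    total = 0.0
    for step in range(steps):
        eps = time_based_epsilon(step)
        action = choose_action(estimates, eps, rng)
        reward = rng.gauss(true_means[action], 1.0)
        counts[action] += 1
        # Incremental sample-mean update of the chosen action's estimate.
        estimates[action] += (reward - estimates[action]) / counts[action]
        total += reward
    return estimates, counts, total


if __name__ == "__main__":
    estimates, counts, total = run_bandit([0.2, 0.5, 1.0], steps=2000)
    print("estimates:", [round(e, 2) for e in estimates])
    print("counts:", counts)
```

With a schedule of this kind the agent explores heavily at first and mostly exploits later. A utility-based rule would instead adjust the exploration rate from the rewards the agent has observed.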
Rights: This publication is a work of the U.S. Government as defined in Title 17, United States Code, Section 101. Copyright protection is not available for this work in the United States.
Showing items related by title, author, creator and subject.
Understanding optimal decision-making. Critz, John W. (Monterey, California: Naval Postgraduate School, 2015-06). The military has realized that their most valuable and adaptable assets are its leaders. Understanding optimal decision-making will allow the military to more effectively train its leaders. The Cognitive Alignment with ...
Autonomous Underwater Vehicle Planning for Information Exploitation. Wiseman, Adam (Monterey, California. Naval Postgraduate School, 2012-03). The ability of an Autonomous Underwater Vehicle (AUV) to dynamically plan safe routes and maneuvers in dangerous environments is directly relevant for the future of the use of AUVs in the exploration and exploitation of ...
Where do I start?: decision making in complex novel environments. Diaz, Sara Katherine. (Monterey, California. Naval Postgraduate School, 2010-09). Threats to our country have never been more real, nor had more potential to impact large populations of Americans. From the homeland defense perspective, some ideology-based groups have the ability and intention to attack ...