Randomization testing of machine induced rules

Authors
Berry, Eric Dean.
Advisors
Ramesh, B.
Haga, William J.
Date of Issue
1995-03
Date
September 1995
Publisher
Monterey, California. Naval Postgraduate School
Language
en_US
Abstract
The Department of Defense (DOD) possesses tremendous amounts of data stored in many large databases. Given the size of these databases, large-scale data analysis tools are required to find previously unknown and interesting patterns. Data mining tools that produce output in the form of production rules, i.e., 'If x, Then y', are preferred because the generated rules are understandable by humans and readily support decision-making processes. This thesis investigates the problems associated with the statistical testing of rules generated by data mining systems. Such testing is required to ensure that the generated rules are based on valid statistical relationships and are not the result of random variation in the underlying data. A strategy for testing rules with a non-parametric test known as the randomization test is implemented and applied to rules from a prototype data mining system.
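To illustrate the general idea of a randomization test applied to an 'If x, Then y' rule, here is a minimal sketch. It is not the procedure implemented in the thesis, which this record does not describe. The function names, the choice of rule confidence as the test statistic, the number of shuffles, and the add-one correction are all assumptions made for illustration.

```python
import random


def rule_confidence(antecedent, consequent):
    """Fraction of records satisfying x that also satisfy y."""
    covered = [y for x, y in zip(antecedent, consequent) if x]
    return sum(covered) / len(covered) if covered else 0.0


def randomization_test(antecedent, consequent, n_shuffles=2000, seed=0):
    """Estimate how often shuffled data matches or beats the observed rule.

    antecedent, consequent: equal-length lists of booleans, one per record.
    Shuffling the consequent breaks any real link between x and y, so the
    shuffled confidences approximate what random variation alone produces.
    """
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    observed = rule_confidence(antecedent, consequent)
    shuffled = list(consequent)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if rule_confidence(antecedent, shuffled) >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimated p-value above zero.
    p_value = (at_least_as_extreme + 1) / (n_shuffles + 1)
    return observed, p_value


# Hypothetical data: x and y coincide on every one of 40 records.
x = [True] * 20 + [False] * 20
y = [True] * 20 + [False] * 20
confidence, p_value = randomization_test(x, y)
print(confidence, p_value)
```

A small p-value suggests the rule is unlikely to be the result of random variation alone; a rule whose consequent holds for nearly every record regardless of x would instead yield a large p-value.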
Type
Thesis
Department
Information Technology Management
Funding
NA
Format
79 p.
Rights
This publication is a work of the U.S. Government as defined in Title 17, United States Code, Section 101. Copyright protection is not available for this work in the United States.