Multiple additive regression trees: a methodology for predictive data mining for fraud detection
Authors
Monteiro, Antonio Jorge Ferreira da Silva.
Subjects
Fraud
Data mining
MART
Classification trees
Relative importance of variables
Missing values
Advisors
Whitaker, Lyn R.
Date of Issue
2002-09
Date
September 2002
Publisher
Monterey, California. Naval Postgraduate School
Abstract
The Defense Finance and Accounting Service (DFAS) Operation Mongoose (Internal Review - Seaside) is using new and innovative techniques for fraud detection. Its primary techniques are the data mining tools of classification trees and neural networks, together with methods for pooling the results of multiple model fits. In this thesis, a new data mining methodology, Multiple Additive Regression Trees (MART), is applied to the problem of detecting potentially fraudulent and suspect transactions (those with conditions needing improvement, or CNIs). MART is an automated method for pooling a "forest" of hundreds of classification trees. This study shows how MART can be applied to fraud data. In particular, it shows how MART identified classes of important variables, and that MART is as effective with raw input variables as it is with the categorical variables currently constructed individually by DFAS. MART is also used to explore the effects of the substantial amount of missing data in the historical fraud database. In general, MART is as accurate as existing methods, requires much less effort to implement (saving many man-days), handles missing values in a sensible and transparent way, and provides features such as identifying the more important variables.
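As a rough illustration of the ideas named in the abstract (pooling many small trees, routing missing values, and ranking variables by importance), the following is a minimal sketch of gradient tree boosting with one-split trees for a binary outcome. It is not the software used in the thesis and does not use DFAS data. The function names (`fit_mart`, `fit_stump`, `predict_proba`), the toy data, the learning rate, and the choice of mean-residual leaf values are all assumptions made for this example. Missing values are represented as `None` and sent to whichever side of each split reduces squared error most, which is one common convention and may differ from the treatment in the thesis.

```python
import math

def goes_left(x, j, t, miss_left):
    """Route a row: missing values (None) follow the side learned at fit time."""
    return miss_left if x[j] is None else x[j] <= t

def fit_stump(X, r):
    """Best one-split regression tree for residuals r, by squared-error gain."""
    n, total = len(X), sum(r)
    best = None
    for j in range(len(X[0])):
        values = sorted({x[j] for x in X if x[j] is not None})
        for t in values[:-1]:                      # leaves at least one row on the right
            for miss_left in (True, False):        # try both sides for missing values
                left = [ri for x, ri in zip(X, r) if goes_left(x, j, t, miss_left)]
                s_l, n_l = sum(left), len(left)
                s_r, n_r = total - s_l, n - n_l
                if n_l == 0 or n_r == 0:
                    continue
                gain = s_l ** 2 / n_l + s_r ** 2 / n_r - total ** 2 / n
                if best is None or gain > best[0]:
                    best = (gain, j, t, miss_left, s_l / n_l, s_r / n_r)
    return best

def fit_mart(X, y, n_trees=50, learning_rate=0.1):
    """Additive model of stumps fit to the gradient of the log-loss (illustrative)."""
    n = len(y)
    p0 = sum(y) / n                                # assumes both classes are present
    f0 = math.log(p0 / (1 - p0))
    F = [f0] * n
    stumps, importance = [], [0.0] * len(X[0])
    for _ in range(n_trees):
        # Residual = observed label minus current predicted probability.
        r = [yi - 1 / (1 + math.exp(-Fi)) for yi, Fi in zip(y, F)]
        best = fit_stump(X, r)
        if best is None:
            break
        gain, j, t, miss_left, v_l, v_r = best
        importance[j] += gain                      # credit the split variable
        stumps.append((j, t, miss_left, learning_rate * v_l, learning_rate * v_r))
        for i, x in enumerate(X):
            F[i] += learning_rate * (v_l if goes_left(x, j, t, miss_left) else v_r)
    scale = sum(importance) or 1.0
    return {"f0": f0, "stumps": stumps,
            "importance": [g / scale for g in importance]}

def predict_proba(model, x):
    """Predicted probability of the positive class for one row."""
    score = model["f0"] + sum(v_l if goes_left(x, j, t, m) else v_r
                              for j, t, m, v_l, v_r in model["stumps"])
    return 1 / (1 + math.exp(-score))

# Hypothetical toy data: column 0 drives the label, column 1 is noise.
X = [[i, (i * 7) % 5] for i in range(10)]
y = [1 if i >= 5 else 0 for i in range(10)]
X[2][0] = None                                     # inject missing values
X[7][1] = None

model = fit_mart(X, y)
print(model["importance"])
print(predict_proba(model, [9, None]), predict_proba(model, [None, 2]))
```

The relative importance here is each variable's share of the total squared-error reduction across all trees. A fuller implementation would use deeper trees and the leaf-value updates of the published boosting algorithm rather than plain residual means.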
Type
Thesis
Department
Operations Research (OR)
Organization
Naval Postgraduate School (U.S.)
Sponsors
Operation Mongoose / DFAS, Seaside, CA
Format
xviii, 93 p. : ill. (some col.)
Distribution Statement
Approved for public release; distribution is unlimited.
Rights
Copyright is reserved by the copyright owner.