An improved unsupervised modeling methodology for detecting fraud in vendor payment transactions
Authors
Rouillard, Gregory W.
Advisors
Buttrey, Samuel E.
Date of Issue
2003-06
Publisher
Monterey, California. Naval Postgraduate School
Abstract
This thesis presents an improved methodology for detecting fraud in Defense Finance and Accounting Service (DFAS) vendor payment transactions through Unsupervised Modeling (cluster analysis). Clementine Data Mining software is used to construct unsupervised models of vendor payment data using the K-Means, Two Step, and Kohonen algorithms. Cluster validation techniques are applied to select the most useful model of each type, and the selected models are then combined to choose candidate records for physical examination by a DFAS auditor. Our unsupervised modeling technique uses all the available valid transaction data, much of which is not admitted under the current supervised modeling procedure. Our procedure standardizes and adds rigor to the existing unsupervised modeling methodology at DFAS. Additionally, we demonstrate a new clustering approach called Tree Clustering, which uses Classification and Regression Trees to cluster data with automatic variable selection and scaling. A recommended SOP for Unsupervised Modeling, a detailed explanation of all Clementine procedures, and an implementation of the Tree Clustering algorithm are included as appendices.
Type
Thesis
Department
Operations Research (OR)
Organization
Naval Postgraduate School (U.S.)
Format
xxii, 149 p. : ill. (chiefly col.)
Distribution Statement
Approved for public release; distribution is unlimited.
Rights
This publication is a work of the U.S. Government as defined in Title 17, United States Code, Section 101. Copyright protection is not available for this work in the United States.