An improved unsupervised modeling methodology for detecting fraud in vendor payment transactions
Rouillard, Gregory W.
Buttrey, Samuel E.
Whitaker, Lyn R.
(DFAS) vendor payment transactions through Unsupervised Modeling (cluster analysis). Clementine Data Mining software is used to construct unsupervised models of vendor payment data using the K-Means, Two Step, and Kohonen algorithms. Cluster validation techniques are applied to select the most useful model of each type; the selected models are then combined to choose candidate records for physical examination by a DFAS auditor. Our unsupervised modeling technique uses all the available valid transaction data, much of which is not admitted under the current supervised modeling procedure. Our procedure standardizes the existing unsupervised modeling methodology at DFAS and makes it more rigorous. Additionally, we demonstrate a new clustering approach called Tree Clustering, which uses Classification and Regression Trees to cluster data with automatic variable selection and scaling. A Recommended SOP for Unsupervised Modeling, a detailed explanation of all Clementine procedures, and an implementation of the Tree Clustering algorithm are included as appendices.
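As a rough illustration of the cluster-then-examine idea described in the abstract, the sketch below runs a plain K-Means on a handful of made-up two-feature records and ranks them by distance from their own cluster centroid. It is not the thesis procedure, which uses Clementine and combines K-Means, Two Step, and Kohonen models. The data, the farthest-point initialization, the distance-to-centroid ranking rule, and the names `init_centroids`, `kmeans`, and `flag_candidates` are all assumptions made for this example.

```python
import math


def init_centroids(points, k):
    """Deterministic farthest-point start (an assumption, chosen for repeatability)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point whose nearest chosen centroid is farthest away.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, iters=50):
    """Plain Lloyd's K-Means on a list of equal-length numeric tuples."""
    centroids = init_centroids(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each record joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


def flag_candidates(points, labels, centroids, top_n):
    """Indices of the top_n records farthest from their own centroid (assumed rule)."""
    dist = [math.dist(p, centroids[lab]) for p, lab in zip(points, labels)]
    return sorted(range(len(points)), key=lambda i: dist[i], reverse=True)[:top_n]


# Hypothetical, already-scaled records: two tight groups plus one unusual record.
points = [
    (1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.0, 1.2),
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.1), (5.0, 5.2),
    (5.0, 9.0),
]
labels, centroids = kmeans(points, k=2)
candidates = flag_candidates(points, labels, centroids, top_n=1)
print(candidates)
```

On this toy data the last record (index 8) sits far from the rest of its cluster and is the one put forward for review. A fuller treatment would scale the variables, compare several values of `k` with a cluster validation measure, and combine the candidates proposed by more than one model type.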
Approved for public release; distribution is unlimited.
Showing items related by title, author, creator and subject.
Detection of erroneous payments utilizing supervised and unsupervised data mining techniques Yanik, Todd E. (Monterey, California. Naval Postgraduate School, 2004-09);In this thesis we develop a procedure for detecting erroneous payments in the Defense Finance Accounting Service, Internal Review's (DFAS IR) Knowledge Base Of Erroneous Payments (KBOEP), with the use of supervised (Logistic ...
Combining spectral and spatial information into hidden Markov models for unsupervised image classification Tso, B.; Olsen, R.C. (2005-05);Unsupervised classification methodology applied to remote sensing image processing can provide benefits in automatically converting the raw image data into useful information so long as higher classification accuracy is ...
Cheng, Leon (Monterey, California: Naval Postgraduate School, 2013-09);With the vast amount of information and public comment available online, it is of increasing interest to understand what is being said and what topics are trending online. Government agencies, for example, want to know ...