Nearest neighbor classification using a density sensitive distance measurement [electronic resource]

Author
Burkholder, Joshua Jeremy
Date
2009-09
Advisor
Squire, Kevin
Second Reader
Kolsch, Mathias
Abstract
This work proposes a density sensitive distance measurement that takes into account the density of an underlying dataset to better represent the shape of the data when measuring distance. Kernel density estimation, using kernel bandwidths determined by k-nearest neighbor distances, is used to approximate the density of the underlying dataset. A scale is applied to the resulting kernel density estimate, and a line integral is performed along its surface, resulting in a density sensitive distance. This work tests the utility of the proposed density sensitive distance measurement using supervised learning. k-Nearest Neighbor classification using the proposed density sensitive distance measurement is compared with k-Nearest Neighbor classification using Euclidean distance on the Wisconsin Diagnostic Breast Cancer dataset and the MNIST Database of Handwritten Digits. For perspective, these classifiers are also compared to Support Vector Machine and Random Forests classifiers. Stratified 10-fold cross validation is used to estimate the generalization error of each classifier. In all comparisons, k-Nearest Neighbor classification using the proposed density sensitive distance measurement had lower generalization error than k-Nearest Neighbor classification using Euclidean distance. For the MNIST dataset, k-Nearest Neighbor classification using the density sensitive distance measurement also had lower generalization error than both Support Vector Machine and Random Forests classification.
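The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation; it rests on several assumptions that the abstract does not confirm: a Gaussian kernel, a bandwidth for each training point equal to the distance to its k-th nearest neighbor, and a "line integral along the surface" read as the length of the path traced over the scaled density surface above the straight segment between two points, approximated by a polyline. All function names, the `scale` and `steps` parameters, and the toy data are hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_bandwidths(points, k):
    """Assumed rule: bandwidth of a point = distance to its k-th nearest other point."""
    bandwidths = []
    for i, p in enumerate(points):
        dists = sorted(euclidean(p, q) for j, q in enumerate(points) if j != i)
        bandwidths.append(max(dists[k - 1], 1e-12))  # guard against duplicate points
    return bandwidths


def density(x, points, bandwidths):
    """Gaussian kernel density estimate with one bandwidth per training point."""
    dim = len(x)
    total = 0.0
    for p, h in zip(points, bandwidths):
        norm = (2.0 * math.pi * h * h) ** (dim / 2.0)
        total += math.exp(-euclidean(x, p) ** 2 / (2.0 * h * h)) / norm
    return total / len(points)


def density_sensitive_distance(a, b, points, bandwidths, scale=1.0, steps=32):
    """Polyline length over the scaled density surface above the segment a -> b."""
    step_len = euclidean(a, b) / steps
    length = 0.0
    prev_height = scale * density(a, points, bandwidths)
    for s in range(1, steps + 1):
        t = s / steps
        x = [ai + t * (bi - ai) for ai, bi in zip(a, b)]
        height = scale * density(x, points, bandwidths)
        # Each piece combines horizontal travel with the change in surface height.
        length += math.hypot(step_len, height - prev_height)
        prev_height = height
    return length


def knn_classify(query, points, labels, bandwidths, k=3, scale=1.0):
    """Majority vote among the k training points closest under the sketched distance."""
    ranked = sorted(
        (density_sensitive_distance(query, p, points, bandwidths, scale), label)
        for p, label in zip(points, labels)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two small, well-separated clusters.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (3.0, 3.0), (3.2, 2.9), (2.9, 3.3)]
labels = ["a", "a", "a", "b", "b", "b"]
bandwidths = knn_bandwidths(points, k=2)
prediction = knn_classify((0.3, 0.2), points, labels, bandwidths, k=3, scale=1.0)
```

Under this reading, setting `scale` to 0 flattens the surface and the measurement reduces to Euclidean distance, while larger values make changes in estimated density along the path count for more. How the thesis actually chooses the scale and evaluates the integral is not stated in the abstract.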
Related items
Showing items related by title, author, creator and subject.
- CLASSIFICATION OF BOLIDES AND METEORS IN DOPPLER RADAR WEATHER DATA USING UNSUPERVISED MACHINE LEARNING
  Smeresky, Brendon P. (Monterey, CA; Naval Postgraduate School, 2019-12)
  This thesis presents a method for detecting outlier meteors and bolides within Doppler radar data using unsupervised machine learning. Principal Component Analysis (PCA), k-means Clustering, and t-Distributed Statistical ...
- Two new nearest neighbor classification rules
  Karo, Ciril. (Monterey, California. Naval Postgraduate School, 1998-09)
  Nearest Neighbor (NN) classification is a non-parametric discrimination and classification technique. In NN classification a test item is compared by some similarity measure of its multiple variables (usually a distance ...
- TACTICAL APPLICATION OF MACHINE LEARNING TECHNIQUES FOR ANALYZING AUDIT RECORD GENERATION AND UTILIZATION SYSTEM (ARGUS) DATA TO DETECT BOTNET TRAFFIC
  Ross, John T., II; Males, Nathaniel J. (Monterey, CA; Naval Postgraduate School, 2021-06)
  Advancing botnet threats in cyberspace threaten the security of the Department of Defense (DOD) Information Network (DODIN) and have the potential to overwhelm the Defensive Cyber Forces' ability to provide timely assessments ...