Sample entropy and random forests: a methodology for anomaly-based intrusion detection and classification of low-bandwidth malware attacks

Authors
Hyla, Bret M.
Advisors
Martell, Craig
Squire, Kevin
Date of Issue
2006-09
Publisher
Monterey, CA; Naval Postgraduate School
Abstract
Sample Entropy examines changes in the normal distribution of network traffic to identify anomalies. Normalized Information examines the overall probability distribution in a data set. Random Forests is a supervised learning algorithm that is efficient at classifying highly imbalanced data. Anomalies are exceedingly rare compared to the overall volume of network traffic. The combination of these methods enables low-bandwidth anomalies to be identified easily in high-bandwidth network traffic. Using only low-dimensional network information allows for near real-time identification of anomalies. The data set was drawn from the 1999 DARPA intrusion detection evaluation data set. The experiments compare a baseline f-score to the observed entropy and normalized information of the network. Anomalies that are disguised in network flow analysis were detected. Random Forests prove to be capable of classifying anomalies using the sample entropy and normalized information. Our experiment divided the data set into five-minute time slices and found that the sample entropy and normalized information metrics were successful in classifying bad traffic, with a recall of .99 and an f-score of .50, which was 185% better than our baseline.
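The per-slice features described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes sample entropy means the Shannon entropy of the empirical distribution of one traffic feature (destination port is used here as a hypothetical choice), assumes normalized information means that entropy divided by its maximum possible value, and assumes the f-score is the balanced F1 measure. The function names and the `(timestamp, dst_port)` input layout are illustrative only.

```python
import math
from collections import Counter, defaultdict

SLICE_SECONDS = 300  # five-minute time slices, as stated in the abstract


def sample_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # p * log2(1/p) summed over every distinct observed value
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def normalized_information(values):
    """Entropy divided by log2(number of distinct values).

    Assumed definition, not taken from the thesis. Returns 0.0 when
    fewer than two distinct values are present.
    """
    distinct = len(set(values))
    if distinct < 2:
        return 0.0
    return sample_entropy(values) / math.log2(distinct)


def slice_features(packets):
    """Map slice index -> (entropy, normalized information).

    packets is an iterable of (timestamp_seconds, dst_port) pairs;
    this layout is a hypothetical stand-in for real flow records.
    """
    slices = defaultdict(list)
    for timestamp, dst_port in packets:
        slices[int(timestamp // SLICE_SECONDS)].append(dst_port)
    return {
        index: (sample_entropy(ports), normalized_information(ports))
        for index, ports in sorted(slices.items())
    }


def f_score(precision, recall):
    """Balanced F1: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Each slice's feature pair could then be labeled and passed to a Random Forests classifier. Under the balanced F1 assumption, a recall of .99 with an f-score of .50 would correspond to a precision of roughly .33.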
Type
Thesis
Organization
Naval Postgraduate School (U.S.)
Format
xvi, 62 p.
Distribution Statement
Approved for public release; distribution is unlimited.