CLUSTER COMPUTING FOR AUTOMATED NETWORK ANALYSIS AT SCALE
Brida, Benjamin J.
Kragh, Frank E.
Scrofani, James W.
Conventional single-node packet analyzers are unable to monitor network traffic at scale. In this thesis, elements of the Apache Hadoop ecosystem, including HBase, Spark, and MapReduce, are employed to analyze a large collection of network traffic. Limited analysis is conducted directly on packet capture next generation (pcapng) files on the Hadoop Distributed File System (HDFS) using MapReduce. Next, to allow for repeated analysis of the same dataset without reading all source files in their entirety for every calculation, pcapng files are parsed and relevant metadata is bulk loaded into HBase, a Not Only Structured Query Language (NoSQL) database employing HDFS for parallelization. This NoSQL database is then accessed via Apache Spark, where pertinent data is loaded into DataFrames and additional analysis of the network traffic takes place. This research demonstrates the viability of custom, modular, automated analytics that employ open-source software for parallelization to conduct traffic analysis at scale.
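The parse-then-aggregate idea in the abstract can be illustrated on a single machine. The sketch below is not the thesis code: it is a minimal, standard-library-only example that assumes a well-formed pcapng byte string, walks its blocks, pulls a few metadata fields out of each Enhanced Packet Block (the "map" step), and sums original packet lengths per interface (the "reduce" step). The function names `iter_packet_metadata`, `bytes_per_interface`, and `make_block` are hypothetical, and the synthetic capture omits the Interface Description Blocks a real file would contain.

```python
import struct
from collections import Counter

SHB_TYPE = 0x0A0D0D0A          # Section Header Block
EPB_TYPE = 0x00000006          # Enhanced Packet Block
BYTE_ORDER_MAGIC = 0x1A2B3C4D  # written in the section's native byte order


def iter_packet_metadata(data):
    """Yield (interface_id, raw_timestamp, captured_len, original_len) per EPB.

    The timestamp is the raw 64-bit counter; its unit depends on the
    interface's if_tsresol option, which this sketch does not interpret.
    """
    endian = "<"
    offset = 0
    while offset + 12 <= len(data):
        # The SHB type is a byte palindrome, so it reads the same either way.
        (block_type,) = struct.unpack_from("<I", data, offset)
        if block_type == SHB_TYPE:
            # The magic number tells us the byte order of this section.
            (magic,) = struct.unpack_from("<I", data, offset + 8)
            endian = "<" if magic == BYTE_ORDER_MAGIC else ">"
        else:
            (block_type,) = struct.unpack_from(endian + "I", data, offset)
        (total_len,) = struct.unpack_from(endian + "I", data, offset + 4)
        if total_len < 12 or offset + total_len > len(data):
            break  # truncated or corrupt input: stop rather than guess
        if block_type == EPB_TYPE and total_len >= 32:
            iface, ts_hi, ts_lo, cap_len, orig_len = struct.unpack_from(
                endian + "5I", data, offset + 8
            )
            yield iface, (ts_hi << 32) | ts_lo, cap_len, orig_len
        offset += total_len


def bytes_per_interface(records):
    """Reduce step: total original packet bytes seen on each interface."""
    totals = Counter()
    for iface, _timestamp, _cap_len, orig_len in records:
        totals[iface] += orig_len
    return totals


def make_block(block_type, body, endian="<"):
    """Hypothetical helper: wrap a 4-byte-aligned body in pcapng block framing."""
    total = 12 + len(body)
    header = struct.pack(endian + "II", block_type, total)
    return header + body + struct.pack(endian + "I", total)


# Synthetic capture: one section header followed by three packets.
capture = (
    make_block(SHB_TYPE, struct.pack("<IHHq", BYTE_ORDER_MAGIC, 1, 0, -1))
    + make_block(EPB_TYPE, struct.pack("<5I", 0, 0, 1000, 4, 60) + b"\x00" * 4)
    + make_block(EPB_TYPE, struct.pack("<5I", 0, 0, 2000, 4, 40) + b"\x00" * 4)
    + make_block(EPB_TYPE, struct.pack("<5I", 1, 0, 3000, 4, 100) + b"\x00" * 4)
)
records = list(iter_packet_metadata(capture))
totals = bytes_per_interface(records)
```

In a cluster setting, the per-packet tuples would plausibly become rows in a store such as HBase, and the aggregation would run in parallel over many files; the schema and job layout for that are not given in this abstract, so they are left out here.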
Rights: This publication is a work of the U.S. Government as defined in Title 17, United States Code, Section 101. Copyright protection is not available for this work in the United States.