
dc.contributor.advisor: McCarrin, Michael
dc.contributor.advisor: Gondree, Mark
dc.contributor.author: Bruaene, Joseph Van
dc.date: Mar-16
dc.date.accessioned: 2016-04-29T21:19:06Z
dc.date.available: 2016-04-29T21:19:06Z
dc.date.issued: 2016-03
dc.identifier.uri: http://hdl.handle.net/10945/48487
dc.description: Approved for public release; distribution is unlimited [en_US]
dc.description.abstract: Traditional digital forensic practices have focused on individual hard disk analysis. As the digital universe continues to grow, and cyber crimes become more prevalent, the ability to make large-scale cross-drive correlations across a large corpus of digital media becomes increasingly important. We propose a methodology that builds on bulk-analysis techniques to avoid operating system- and file system-specific parsing. In addition, we apply document similarity methods to forensic artifact correlation. By representing each disk image as a set of hash values corresponding to the 512-byte sectors on the disk, and calculating pair-wise similarity scores between hard disk images, we analyze a collection of disk images taken from various storage devices purchased from the secondary market. We conclude that sector-based matching is sufficient to identify images in our dataset that share common DLLs, indicating similarity in their operating systems. We present a visualization of our results as an undirected graph with similarity scores represented as edge weights, and observe that disk images with common operating systems tend to align with graph clusters. Though no common set of sectors is present on all drives, even within the large fully connected component in our graph, we find that grouping our dataset into subsets with the same operating system version does reveal sizable collections of common sectors and achieves the best correlation between sector matches and high-level similarities in our dataset. Extending this technique to a larger dataset and continuing our investigation of the cause of sector-level matches could yield an automated method of profiling new disk images during the triage process. Moreover, this technique could be used to corroborate deductions regarding characteristics of information systems associated with target media. [en_US]
dc.description.uri: http://archive.org/details/largescalecrossd1094548487
dc.publisher: Monterey, California: Naval Postgraduate School [en_US]
dc.rights: This publication is a work of the U.S. Government as defined in Title 17, United States Code, Section 101. Copyright protection is not available for this work in the United States [en_US]
dc.title: Large scale cross-drive correlation of digital media [en_US]
dc.type: Thesis [en_US]
dc.contributor.department: Computer Science [en_US]
dc.subject.author: Digital Forensics [en_US]
dc.subject.author: Similarity Detection [en_US]
dc.subject.author: Automated Correlation [en_US]
dc.subject.author: Digital Fingerprinting [en_US]
dc.subject.author: Approximate Matching [en_US]
dc.subject.author: Bulk Analysis [en_US]
dc.description.service: Lieutenant, United States Navy [en_US]
etd.thesisdegree.name: Master of Science in Computer Science [en_US]
etd.thesisdegree.level: Masters [en_US]
etd.thesisdegree.discipline: Computer Science [en_US]
etd.thesisdegree.grantor: Naval Postgraduate School [en_US]
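
The approach summarized in the abstract (each disk image represented as a set of hashes of its 512-byte sectors, then scored pair-wise) can be illustrated with a minimal sketch. This is not the thesis implementation: the record does not name the hash function or the similarity measure, so SHA-256 and the Jaccard index are assumptions made here, and the function names sector_hashes, jaccard, and pairwise_similarity are hypothetical.

```python
import hashlib
from itertools import combinations
from typing import Dict, Set, Tuple

SECTOR_SIZE = 512  # bytes per sector, as stated in the abstract


def sector_hashes(image: bytes, sector_size: int = SECTOR_SIZE) -> Set[bytes]:
    """Return the set of digests of every full sector in a raw image.

    Assumption: SHA-256 is used as the sector hash; a trailing partial
    sector is ignored.
    """
    hashes = set()
    for offset in range(0, len(image) - sector_size + 1, sector_size):
        sector = image[offset:offset + sector_size]
        hashes.add(hashlib.sha256(sector).digest())
    return hashes


def jaccard(a: Set[bytes], b: Set[bytes]) -> float:
    """Jaccard index |A & B| / |A | B| (assumed similarity measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def pairwise_similarity(images: Dict[str, bytes]) -> Dict[Tuple[str, str], float]:
    """Score every unordered pair of images; keys are name pairs in sorted order."""
    sets = {name: sector_hashes(data) for name, data in images.items()}
    return {
        (x, y): jaccard(sets[x], sets[y])
        for x, y in combinations(sorted(sets), 2)
    }
```

The resulting dictionary maps each pair of image names to a score, which could serve as the edge weights of an undirected similarity graph of the kind the abstract describes. For real disk images the data would be read from files in chunks rather than held in memory as a single bytes object.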

