Making Sense of Email Addresses on Drives
Rowe, Neil C.
McCarrin, Michael R.
Drives found during investigations often hold useful information in the form of email addresses, which can be acquired by searching the raw drive data independently of the file system. Using these data, we can build a picture of the social networks in which a drive owner participated, perhaps even a better one than we could get from the online profiles maintained by social-networking services, because drives contain much data that users have not approved for public display. However, many addresses found on drives are not forensically interesting, such as sales and support links. We developed a program to filter these out using a Naïve Bayes classifier, and it eliminated 73.3% of the addresses from a representative corpus. We show that the byte-offset proximity of the remaining addresses found on a drive, their word similarity, and their number of co-occurrences over a corpus are good measures of association between addresses, and we used these data to build graphs of the interconnections both between addresses and between drives. The results provided several new insights into our test data.
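As a rough illustration of the byte-offset proximity idea in the abstract, the following minimal Python sketch scans a raw byte buffer for address-like strings and pairs up addresses that start within a fixed number of bytes of each other. It is not the authors' program: the regular expression, the function names `find_addresses` and `nearby_pairs`, and the 64-byte window are all assumptions chosen for the example, and the sketch handles only plain ASCII addresses.

```python
import re

# Simplified ASCII pattern for address-like strings (an assumption for this
# sketch; real drive data would need a more careful pattern).
EMAIL_RE = re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def find_addresses(raw: bytes):
    """Return (byte_offset, address) pairs found anywhere in a raw buffer."""
    return [
        (m.start(), m.group().decode("ascii").lower())
        for m in EMAIL_RE.finditer(raw)
    ]


def nearby_pairs(hits, window=64):
    """Map each pair of distinct addresses whose start offsets differ by at
    most `window` bytes to the smallest such gap seen."""
    hits = sorted(hits)
    pairs = {}
    for i, (off_a, addr_a) in enumerate(hits):
        for off_b, addr_b in hits[i + 1:]:
            gap = off_b - off_a
            if gap > window:
                break  # hits are sorted, so later ones are farther away
            if addr_a != addr_b:
                key = tuple(sorted((addr_a, addr_b)))
                pairs[key] = min(gap, pairs.get(key, gap))
    return pairs


# Made-up buffer: two addresses close together, a third far away.
sample = (
    b"\x00\x01alice@example.com\x00bob@example.org"
    + b"\x00" * 200
    + b"carol@example.net"
)
print(nearby_pairs(find_addresses(sample)))
```

Because the scan works on raw bytes, it needs no file-system parsing. In this made-up buffer only the first two addresses fall inside the window, so they form the only candidate edge for an association graph.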
Showing items related by title, author, creator and subject.
Andrzejewski, Timothy J. (Monterey, California: Naval Postgraduate School, 2017-09);Between 2005 and 2015, the world population grew by 11% while hard drive capacity grew by 95%. Increased demand for storage combined with decreasing costs presents challenges for digital forensic analysts working within ...
McCarrin, Michael; Green, Janina; Gera, Ralucca (Cornell University Library, 2018-06-14);Digital forensic analysts depend on the ability to understand the social networks of the individuals they investigate. We develop a novel method for automatically constructing these networks from collected hard drives. We ...
Taguchi, James K. (Monterey, California: Naval Postgraduate School, 2013-06);With digital storage becoming cheaper, bigger, and more prevalent, finding evidence from the hard drives collected for a case is too difficult and time consuming. Simply reading an entire drive takes hours and it takes ...