Hash-based carving: Searching media for complete files and file fragments with sector hashing and hashdb

Authors
Garfinkel, Simson L.
McCarrin, Michael
Date of Issue
2015
Publisher
Elsevier
Abstract
Hash-based carving is a technique for detecting the presence of specific “target files” on digital media by evaluating the hashes of individual data blocks, rather than the hashes of entire files. Unlike whole-file hashing, hash-based carving can identify files that are fragmented, files that are incomplete, or files that have been partially modified. Previous efforts at hash-based carving have looked for evidence of a single file or a few files. We attempt hash-based carving with a target file database of roughly a million files and discover an unexpectedly high false identification rate resulting from common data structures in Microsoft Office documents and multimedia files. We call such blocks “non-probative blocks.” We present the HASH-SETS algorithm that can determine the presence of files, and the HASH-RUNS algorithm that can reassemble files using a database of file block hashes. Both algorithms address the problem of non-probative blocks and provide results that can be used by analysts looking for target data on searched media. We demonstrate our technique using the bulk_extractor forensic tool, the hashdb hash database, and an algorithm implementation written in Python.
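The block-level matching described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' HASH-SETS or HASH-RUNS implementation and it does not use bulk_extractor or hashdb. The function names, the 4096-byte block size, the choice of SHA-256, and the rule that a hash shared by more than one target file is treated as non-probative are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict

BLOCK_SIZE = 4096  # assumed block size, chosen for illustration only


def block_hashes(data, block_size=BLOCK_SIZE):
    """Yield (offset, hex digest) for each full, aligned block of data."""
    for offset in range(0, len(data) - block_size + 1, block_size):
        block = data[offset:offset + block_size]
        yield offset, hashlib.sha256(block).hexdigest()


def build_database(targets, block_size=BLOCK_SIZE):
    """Map each block hash to the set of (file name, offset) pairs where it occurs.

    `targets` is a dict of file name -> file content (bytes).
    """
    db = defaultdict(set)
    for name, data in targets.items():
        for offset, digest in block_hashes(data, block_size):
            db[digest].add((name, offset))
    return db


def scan_media(media, db, block_size=BLOCK_SIZE):
    """Return {target file name: [media offsets of matching blocks]}.

    Assumption: a hash that occurs in more than one target file is treated
    as non-probative and ignored, a simplified stand-in for the paper's idea.
    """
    hits = defaultdict(list)
    for offset, digest in block_hashes(media, block_size):
        names = {name for name, _ in db.get(digest, ())}
        if len(names) == 1:
            hits[next(iter(names))].append(offset)
    return dict(hits)
```

Calling `build_database` on a dictionary of target file contents and then `scan_media` on a media image reports, for each target file, the media offsets whose block hash occurs in that file alone. Because each block is matched independently, blocks of a fragmented or partially overwritten file are still reported, which is the property that distinguishes this approach from whole-file hashing.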
Type
Article
Description
The article of record as published may be found at https://doi.org/10.1016/j.diin.2015.05.001
Department
Computer Science (CS)
Format
11 p.
Citation
Garfinkel, Simson L., and Michael McCarrin. "Hash-based carving: Searching media for complete files and file fragments with sector hashing and hashdb." Digital Investigation 14 (2015): S95-S105.
Rights
This publication is a work of the U.S. Government as defined in Title 17, United States Code, Section 101. Copyright protection is not available for this work in the United States.