A responsible de-identification of the Real Data Corpus: building a framework for PII management
Authors
An, Johanna
Subjects
De-identification
risk management and assessment
Real Data Corpus
digital forensics
big data
personally identifiable information
Advisors
McCarrin, Michael R.
Denning, Dorothy E.
Date of Issue
2016-09
Publisher
Monterey, CA; Naval Postgraduate School
Abstract
De-identification methods have helped government organizations provide the public with useful information—promoting transparency and accountability while also protecting the individual privacy of the data subjects. However, due to the recent massive increase in data collection and improved methods of analysis, de-identification has become a more difficult task. This work outlines challenges and discusses procedures for making a potentially sensitive data set available to extramural researchers and institutions without significant risk to human subject privacy. We provide a detailed explanation of personally identifiable information to help us understand what forms of personally identifiable information can cause the most harm. Furthermore, we discuss the legality and ethics behind working with personally identifiable information to illustrate the importance of protecting privacy. We then offer a taxonomy of threats, vulnerabilities, and impacts and describe how these determine risk. Based on this taxonomy, we develop a framework to assess risk on the Real Data Corpus, a collection of forensic disk images containing personally identifiable information. In addition, we analyze de-identification methods such as pseudonymization and anonymization, and consider re-identification risks. Finally, we apply our framework and methodology to a real-world scenario to determine the risk of data disclosure to an extramural researcher.
Type
Thesis
Distribution Statement
Approved for public release; distribution is unlimited.
Rights
Copyright is reserved by the copyright owner.
