Fabricating synthetic data in support of training for domestic terrorist activity data mining research

Authors
Lavelle, Stephen J.
Advisors
Garfinkel, Simson
Date of Issue
2010-09
Publisher
Monterey, California. Naval Postgraduate School
Abstract
Data mining is a mature technology, widespread in both government and industry. The proliferation of data storage in the public and private sectors has produced more information than can be expediently processed. Data mining provides a means to extract meaningful conclusions from this growing store of data. In the interest of countering criminal and terrorist activity, data mining has become a focus of law enforcement and government agencies. The use of databases containing information on persons may conflict with privacy rights and laws. Growing public awareness of government data mining programs and databases has been accompanied by concern about, and investigation of, these programs. Following a review of data mining and privacy issues, in 2008 the National Research Council (NRC) recommended that any training in development of data mining programs involving personal data be conducted using synthesized data. This thesis presents an underlying discussion of these issues, including data mining use, a simple data synthesis model for analysis supporting the validity of the NRC recommendation, and the difficulties encountered in the process. Included is an analysis of the inherent difficulty of creating realistic and useful data.
Type
Thesis
Department
Computer Science
Organization
Naval Postgraduate School (U.S.)
Format
xvi, 87 p. : ill.
Distribution Statement
Approved for public release; distribution is unlimited.
Rights
This publication is a work of the U.S. Government as defined in Title 17, United States Code, Section 101. Copyright protection is not available for this work in the United States.