The Islamic State battle plan: press release natural language processing
Friedlein, James R.
Whitaker, Lyn R.
Whiteside, Craig A.
The purpose of this study is to develop methods to accelerate and enhance the analysis of Islamic State Movement text documents. We analyze a unique database collected by Dr. Craig Whiteside, which comprises nearly 3,000 open-source translated press releases from 2003–2014. Using Natural Language Processing tools, we aggregate the text data into a corpus and process it based on document term structure and frequency. To reduce analyst workload, we validate Whiteside's manual analysis and construct cross-validated generalized linear models that automatically classify each document into one of seven types. A cascade classification model outperforms all other models, with a mean cross-validated misclassification rate of 5.71 percent. Islamic State Movement operational summaries are classified as type Celebrate. We develop a layered algorithm based on regular expressions and location searches to extract critical information from each attack event, and we display the details on a map using a web-based interactive R Shiny application. With the ability to automatically classify Islamic State Movement text documents and to interact visually with the data contained in those classified as type Celebrate, analysts and decision makers can process and understand large amounts of text data more quickly and effectively.
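The layered extraction step described in the abstract can be illustrated with a minimal sketch. This is not the thesis's algorithm: it is written in Python rather than R, and the gazetteer, the approximate coordinates, the date pattern, the function name `extract_event`, and the sample sentence are all assumptions invented for illustration. It shows only the general idea of applying a regular expression layer and then a location search layer to one line of text.

```python
import re

# Hypothetical gazetteer (an assumption, not from the thesis):
# place name -> approximate (latitude, longitude).
GAZETTEER = {
    "Mosul": (36.34, 43.13),
    "Baghdad": (33.31, 44.37),
    "Ramadi": (33.43, 43.30),
}

# Layer 1: a date written as "12 March 2007" (assumed format).
DATE_RE = re.compile(
    r"\b(\d{1,2})\s+(January|February|March|April|May|June|July|August|"
    r"September|October|November|December)\s+(\d{4})\b"
)

# Layer 2: any place name that appears in the gazetteer.
LOCATION_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, GAZETTEER)) + r")\b"
)


def extract_event(sentence):
    """Return the date, place name, and coordinates found in one sentence.

    Fields that cannot be found are returned as None.
    """
    date = DATE_RE.search(sentence)
    place = LOCATION_RE.search(sentence)
    return {
        "date": date.group(0) if date else None,
        "location": place.group(1) if place else None,
        "coords": GAZETTEER[place.group(1)] if place else None,
    }


# Invented sample sentence, not taken from the press release database.
sample = "On 12 March 2007 a convoy was struck near Mosul."
event = extract_event(sample)
print(event)
```

Each extracted record carries coordinates, so a list of such records could be passed to a mapping front end; the thesis uses an R Shiny application for that purpose.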
Approved for public release; distribution is unlimited
Showing items related by title, author, creator and subject.
Durham, Jonathan S. (Monterey, California. Naval Postgraduate School, 2009-09); The ubiquity of Internet chat applications has benefited many different segments of society. It also creates opportunities for criminal enterprise, terrorism, and espionage. This thesis proposes statistical Natural Language ...
Gupta, Anjum (Monterey, California. Naval Postgraduate School, 2011-03); Automatic text document classification is a fundamental problem in machine learning. Given the dynamic nature and the exponential growth of the World Wide Web, one needs the ability to classify not only a massive number ...
Elkern, Kenneth F., Jr. (Monterey, California. Naval Postgraduate School, 1994-09); Previous automated classified document systems developed commercially or in-house to serve classified libraries with 50,000 documents or less, have been limited by excessive cost or insufficient ...