PREDICTING COLLECTIVE VIOLENCE FROM COORDINATED HOSTILE INFORMATION CAMPAIGNS IN SOCIAL MEDIA
Authors
Mendieta, Milton V.
Subjects
violence prediction
hostile information campaigns
multilingual language models
NLP
social media
deep learning
Twitter
Advisors
Warren, Timothy C.
Yoshida, Ruriko
Date of Issue
2022-12
Publisher
Monterey, CA; Naval Postgraduate School
Abstract
Predicting conflicts before they occur can help deter outbreaks of collective violence and avoid human suffering. Existing approaches use statistical and machine learning models, as well as social network analysis techniques; however, they are generally confined to long-range predictions in specific regions and draw on only a few languages. Understanding collective violence from social media signals in multiple or mixed languages remains understudied. In this work, we construct a multilingual language model (MLLM) that is language-agnostic: it can accept social media input in any language. The purpose of this study is twofold. First, it aims to collect a multilingual violence corpus from archived Twitter data using a proposed set of heuristics that account for spatial-temporal features around past and future violent events. Second, it compares the performance of traditional machine learning (ML) classifiers against deep learning MLLMs in predicting message classes linked to past and future occurrences of violent events. Our findings suggest that MLLMs substantially outperform traditional ML models in predictive accuracy. One major contribution of our work is that military commands now have a tool to evaluate and learn the language of violence across all human languages. Finally, we have made the data, code, and models publicly available.
Type
Thesis
Department
Defense Analysis (DA)
Operations Research (OR)
Distribution Statement
Approved for public release. Distribution is unlimited.
Rights
Copyright is reserved by the copyright owner.