Unsupervised topic discovery by anomaly detection
With the vast amount of information and public comment available online, there is growing interest in understanding what is being said and which topics are trending. Government agencies, for example, want to know which policies concern the public without having to read through thousands of comments manually. Topic detection automatically identifies the topics in documents based on their information content, and it supports many natural language processing tasks, including text summarization and information retrieval. Unsupervised topic detection, however, has always been a difficult task. Methods such as Latent Dirichlet Allocation (LDA) convert documents from word space into topic space (each document becomes a weighted sum over topics), but they do not perform any form of classification, nor do they address how the generated topics relate to actual human-level topics. In this thesis we attempt a novel approach to unsupervised topic detection and classification: performing LDA and then clustering the result. We propose variations of the popular K-Means clustering algorithm that optimize the choice of centroids, and we perform experiments using Facebook data and the New York Times (NYT) corpus. Although the results were poor for the Facebook data, our method performed acceptably on the NYT data. The new clustering algorithms also performed slightly but consistently better than the standard K-Means algorithm.
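The clustering step described in the abstract can be illustrated with a minimal sketch. It assumes the LDA stage has already produced one topic-proportion vector per document; the `doc_topics` values below are made up for illustration and are not results from the thesis. The `kmeans` function is plain Lloyd-style K-Means with a simple farthest-point initialization, chosen here only to keep the example deterministic. It is not the centroid-selection variation proposed in the thesis, which the abstract does not specify.

```python
import math


def kmeans(points, k, iters=100):
    """Plain Lloyd-style K-Means on equal-length lists of floats.

    Initialization is farthest-point (an illustrative assumption,
    not the thesis's method): start from the first point, then
    repeatedly add the point farthest from all chosen centroids.
    """
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(math.dist(p, c) for c in centroids),
        )
        centroids.append(list(farthest))

    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical document-topic proportions, such as an LDA model
# might output for six documents and three topics.
doc_topics = [
    [0.90, 0.05, 0.05],
    [0.80, 0.10, 0.10],
    [0.10, 0.85, 0.05],
    [0.05, 0.90, 0.05],
    [0.10, 0.10, 0.80],
    [0.05, 0.15, 0.80],
]

labels, centroids = kmeans(doc_topics, k=3)
print(labels)
```

Documents dominated by the same topic end up in the same cluster, which is the sense in which clustering adds a classification step on top of the LDA representation.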
Approved for public release; distribution is unlimited