Social media sentiment analysis and topic detection for Singapore English
Phua, Yee Ling
Social media has become an increasingly important part of daily life in the last few years. With the convenience built into smart devices, social-media applications have made many new ways of communicating possible. Sentiment analysis and topic detection are two growing areas in Natural Language Processing, and both are increasingly used in social media analytics. In this thesis, we analyze various standard methods used in supervised sentiment analysis and supervised topic detection on social media for Colloquial Singapore English. For supervised topic detection, we created a naïve Bayes classifier that performed classification on 5,000 annotated Facebook posts. We compared the results of our classifier against open-source classifiers based on Support Vector Machine (SVM), Maximum Entropy, and Labeled Latent Dirichlet Allocation (Labeled LDA). For supervised sentiment analysis, we developed a phrasal classifier that analyzed the polarity of 425 argumentative Facebook posts. In supervised topic detection, our naïve Bayes classifier gave the best accuracy: 89% on two-class classification and 57% on six-class classification. In supervised sentiment analysis, our phrasal classifier obtained an accuracy of 35.5%, with the negative polarity class achieving a high precision of 94.3%.
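To make the approach named in the abstract concrete, here is a minimal sketch of a multinomial naïve Bayes topic classifier with add-one smoothing, together with the two metrics the abstract reports (accuracy and per-class precision). It is not the thesis's implementation: the whitespace tokenizer, the smoothing choice, the function names, and the four toy posts and their labels are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization. Real Colloquial
    # Singapore English posts would likely need more careful handling.
    return text.lower().split()


def train(labeled_posts, alpha=1.0):
    """Count documents per class and tokens per class (alpha = smoothing)."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_posts:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return {"class_counts": class_counts, "word_counts": word_counts,
            "vocab": vocab, "alpha": alpha}


def predict(model, text):
    """Return the class with the highest log prior plus log likelihood."""
    total_docs = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]
    best_label, best_score = None, -math.inf
    for label, n_docs in model["class_counts"].items():
        score = math.log(n_docs / total_docs)
        total_words = sum(model["word_counts"][label].values())
        for tok in tokenize(text):
            if tok not in model["vocab"]:
                continue  # ignore tokens never seen in training
            count = model["word_counts"][label][tok]
            score += math.log((count + alpha) / (total_words + alpha * vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(gold, pred):
    # Fraction of all posts whose predicted label matches the gold label.
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def precision(gold, pred, label):
    # Of the posts predicted as `label`, the fraction that truly are `label`.
    hits = [g for g, p in zip(gold, pred) if p == label]
    return sum(g == label for g in hits) / len(hits) if hits else 0.0


# Hypothetical toy data, not taken from the thesis corpus.
TOY_POSTS = [
    ("the mrt train delay again lah", "transport"),
    ("bus fare increase so expensive", "transport"),
    ("chicken rice at the hawker centre very shiok", "food"),
    ("laksa and kopi for breakfast", "food"),
]
model = train(TOY_POSTS)
```

Calling `predict(model, "train delay so long")` scores each class and returns the more probable label. The two metric helpers also show how the reported sentiment results can coexist: a classifier that rarely predicts the negative class, but is almost always right when it does, can have high negative-class precision while its overall accuracy stays low.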
Approved for public release; distribution is unlimited.
Showing items related by title, author, creator and subject.
Sample entropy and random forests: a methodology for anomaly-based intrusion detection and classification of low-bandwidth malware attacks; Hyla, Bret M. (Monterey, California: Naval Postgraduate School, 2006-09); Sample Entropy examines changes in the normal distribution of network traffic to identify anomalies. Normalized Information examines the overall probability distribution in a data set. Random Forests is a supervised ...
McIver, Charles A. (Monterey, California: Naval Postgraduate School, 2017-03);Remote-sensing analysis is conducted for the Naval Postgraduate School campus, containing buildings, impervious surfaces (asphalt and concrete), natural ground, and vegetation. Data is from the Optech Titan, providing ...
Thomas, Judson J. C. (Monterey, California: Naval Postgraduate School, 2015-09);With the arrival of Optech’s Titan multispectral LiDAR sensor, it is now possible to simultaneously collect three different wavelengths of LiDAR data. Much of the work performed on multispectral LiDAR data involves gridding ...