Age Detection in Chat
Martell, Craig H.
This paper presents the results of using statistical analysis and automatic text categorization to identify an author's age group from the author's online chat posts. Two classifiers were used: a Naive Bayes classifier and a Support Vector Machine (SVM) model. In the SVM experiments, the model achieved an F-score of 0.996 on test data when distinguishing teens from adults. We also introduce an alternative method for generating "stop words" that chooses n-grams based on their relative distribution across the classes.
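The abstract does not spell out how the distribution-based stop words are chosen, so the following is only a minimal sketch of one possible reading, not the paper's method. It assumes that an n-gram counts as a stop n-gram when it occurs in both classes with roughly the same relative frequency. The function names, the whitespace tokenizer, and the `max_ratio` threshold are all hypothetical choices made for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams (as tuples) found in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def relative_freqs(posts, n):
    """Relative frequency of each n-gram over all posts of one class."""
    counts = Counter()
    for post in posts:
        # Assumption: lowercase whitespace tokenization stands in for
        # whatever tokenizer the paper actually used.
        counts.update(ngrams(post.lower().split(), n))
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


def distribution_stop_ngrams(teen_posts, adult_posts, n=1, max_ratio=1.5):
    """Hypothetical criterion: keep n-grams seen in both classes whose
    relative frequencies differ by at most a factor of max_ratio."""
    teen = relative_freqs(teen_posts, n)
    adult = relative_freqs(adult_posts, n)
    stops = set()
    for gram in teen.keys() & adult.keys():
        high = max(teen[gram], adult[gram])
        low = min(teen[gram], adult[gram])
        if high / low <= max_ratio:
            stops.add(gram)
    return stops
```

Under this reading, n-grams that are evenly spread across teens and adults carry little class information and can be dropped before training, whereas n-grams concentrated in one class are kept as features. A real implementation would likely need smoothing and a minimum-count cutoff, which this sketch omits.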
Rights: This publication is a work of the U.S. Government as defined in Title 17, United States Code, Section 101. Copyright protection is not available for this work in the United States.