Lexical and Discourse Analysis of Online Chat Dialog
Authors
Forsyth, Eric N.
Martell, Craig H.
Date of Issue
2007
Abstract
One of the ultimate goals of natural language processing (NLP) systems is to understand the meaning of what is being transmitted, irrespective of the medium (e.g., written versus spoken) or the form (e.g., static documents versus dynamic dialogues). Although much work has been done in traditional language domains such as speech and static written text, little has yet been done in the newer communication domains enabled by the Internet, e.g., online chat and instant messaging. This is partly because no annotated chat corpora are available to the broader research community. The purpose of this research is to build a chat corpus tagged with lexical (token part-of-speech labels), syntactic (post parse tree), and discourse (post classification) information. Such a corpus can then be used to develop more complex, statistically based NLP applications that perform tasks such as author profiling, entity identification, and social network analysis.
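As an illustration of the three annotation layers named above, the following is a minimal sketch of how one tagged chat post might be represented. It is not the corpus's actual format: the class name AnnotatedPost, its field names, the helper class_counts, the sample posts, and the labels "Greeting" and "Question" are all hypothetical, and the part-of-speech tags simply follow common Penn Treebank conventions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class AnnotatedPost:
    """One chat post with lexical, syntactic, and discourse annotations.

    Hypothetical layout for illustration only; not the corpus's real schema.
    """
    author: str                        # who sent the post (assumed field)
    tokens: List[Tuple[str, str]]      # lexical layer: (token, POS tag) pairs
    parse_tree: Optional[str] = None   # syntactic layer: bracketed parse, if present
    post_class: Optional[str] = None   # discourse layer: one label per post

    def text(self) -> str:
        """Rebuild the post text by joining its tokens with spaces."""
        return " ".join(token for token, _ in self.tokens)

    def tags(self) -> List[str]:
        """Return only the part-of-speech tags, in token order."""
        return [tag for _, tag in self.tokens]


def class_counts(posts: List[AnnotatedPost]) -> Dict[str, int]:
    """Count posts per discourse label, skipping unlabeled posts."""
    counts: Dict[str, int] = {}
    for post in posts:
        if post.post_class is not None:
            counts[post.post_class] = counts.get(post.post_class, 0) + 1
    return counts


# Two made-up posts; the second has no parse tree yet.
corpus = [
    AnnotatedPost(
        author="user1",
        tokens=[("hi", "UH"), ("everyone", "NN")],
        parse_tree="(INTJ (UH hi) (NN everyone))",
        post_class="Greeting",
    ),
    AnnotatedPost(
        author="user2",
        tokens=[("how", "WRB"), ("are", "VBP"), ("you", "PRP")],
        post_class="Question",
    ),
]
```

Keeping the three layers as separate fields on one record lets a downstream application, such as a post classifier, read only the layer it needs.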
Type
Article
Department
Computer Science (CS)
Rights
This publication is a work of the U.S. Government as defined in Title 17, United States Code, Section 101. Copyright protection is not available for this work in the United States.