Conversation thread extraction and topic detection in text-based chat

Authors
Adams, Paige Holland.
Advisors
Martell, Craig H.
Date of Issue
2008-09
Publisher
Monterey, California. Naval Postgraduate School
Abstract
Text-based chat systems are widely used within the Department of Defense, but the standard systems available do not provide robust capabilities for search, information retrieval, or information assurance. The objective of this research is to explore methods for the extraction of conversation threads from text-based chat systems in order to enable such tasks. As part of the research, we manually annotated over 20,000 Internet Relay Chat posts with conversation thread information and constructed a probabilistic model for automatically classifying posts according to conversation thread. We also provide an algorithm for extracting these conversation threads from the chat session in order to form discrete documents that may be used in a vector space model information retrieval system. We elaborate how this technique can be used to support search and data mining systems, as well as auditing tasks and guard functions in a security system. Using the developed probabilistic models, we have achieved classification results on par with those of human annotators.
Type
Thesis
Department
Computer Science
Organization
Naval Postgraduate School
Format
xiv, 175 p. : ill. ;
Distribution Statement
Approved for public release; distribution is unlimited.