Use of probabilistic topic models for search

Authors
Draeger, Marco.
Advisors
Squire, Kevin M.
Date of Issue
2009-09
Publisher
Monterey, California. Naval Postgraduate School
Abstract
This thesis addresses a common problem in search applications: users typically do not know exactly which terms appear in the documents they are searching for. Several attempts have been made to overcome this problem by augmenting the document model, the query, or both. In this thesis, a probabilistic topic model augments the document model. Probabilistic document models are formally introduced and inference methods are derived. The thesis shows how these models can be used for information retrieval tasks and how a search application can be implemented. A prototype was implemented, then tested and evaluated on benchmark corpora. The evaluation provides empirical evidence that probabilistic document models significantly improve retrieval performance, and it shows which preprocessing steps should be applied before the model is used.
Type
Thesis
Organization
Naval Postgraduate School (U.S.)
Format
xvi, 73 p. : ill. (some col.)
Distribution Statement
Approved for public release; distribution is unlimited.
Rights
This publication is a work of the U.S. Government as defined
in Title 17, United States Code, Section 101. As such, it is in the
public domain, and under the provisions of Title 17, United States
Code, Section 105, is not copyrighted in the U.S.