Using Context to Disambiguate Web Captions
Authors
Rowe, Neil C.
Subjects
captions
World Wide Web
data mining
disambiguation
context
similarity
Date of Issue
2004-06
Date
June 2004
Publisher
Monterey, California. Naval Postgraduate School
Abstract
The easiest way to index multimedia from ordinary Web pages is to find their captions. However, captions are not
used consistently, and retrieval effectiveness for caption-based multimedia browsers is significantly poorer than that
for text retrieval. We show that statistical "context" information about the Web pages at a site can help recognize
image captions by quantifying their "representativeness". Experiments were conducted on a random sample of 5010
image captions from 3.2 million candidates from 5 million Web pages, and 1220 audio and video captions from
720,000 candidates from those same Web pages. They showed that while statistical context information was
definitely a good clue, it usually did not appear to add much beyond what good local clues in the candidate caption-image
pair itself provide, and provided no help for caption-audio and caption-video pairs.
Type
Conference Paper
Description
This paper appeared in the Internet Computing Conference, Las Vegas, NV, June 2004.
Department
Computer Science (CS)
Citation
Internet Computing Conference, Las Vegas, NV, June 2004.
Distribution Statement
Approved for public release; distribution is unlimited.