Do Word Clues Suffice in Detecting Spam and Phishing?

Authors
Rowe, Neil C.
Barnes, David S.
McVicker, Michael
Egan, Melissa
David, Duane T.
Guiterrez, Louis
Martell, Craig H.
Subjects
spam
phishing
clues
words
testing
Date of Issue
2007-06
Date
June 2007
Publisher
Monterey, California. Naval Postgraduate School
Abstract
Some commercial antispam and anti-phishing products block email from “blacklisted” sites that they claim send spam and phishing email, while allowing email that claims to be from “whitelisted” sites that they claim are known not to send it. This approach tends to discriminate unfairly against smaller and less-known sites, and would seem to be anti-competitive. An open question is whether other clues to spam and phishing would suffice to identify them. We report on experiments that we conducted to compare different clues for automated detection tools. Results show that word clues were by far the best clues for spam and phishing, although slightly better performance could be obtained by supplementing word clues with a few others, such as the time of day the email was sent and inconsistencies in headers. We also compared different approaches to combining clues to spam, including Bayesian reasoning, case-based reasoning, and neural networks; Bayesian reasoning performed best. Our conclusion is that Bayesian reasoning on word clues is sufficient for antispam software and that blacklists and whitelists are unnecessary.
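As a rough illustration of what Bayesian reasoning on word clues can look like, the following is a minimal naive Bayes sketch. It is not the authors' implementation: the abstract does not give their formulas, so the add-one smoothing, the whitespace tokenization, the function names (train, spam_log_odds), and the toy training messages are all assumptions made for this example.

```python
import math
from collections import Counter

def train(messages):
    """Count, per class, how many messages contain each word.

    messages: iterable of (text, is_spam) pairs (hypothetical toy format).
    """
    word_counts = {True: Counter(), False: Counter()}
    class_counts = Counter()
    for text, is_spam in messages:
        class_counts[is_spam] += 1
        # Each word is counted at most once per message.
        word_counts[is_spam].update(set(text.lower().split()))
    return word_counts, class_counts

def spam_log_odds(text, word_counts, class_counts):
    """Return log P(spam | words) - log P(not spam | words), naive Bayes style.

    Assumes both classes appear at least once in the training data.
    """
    # Prior log odds from the class frequencies.
    score = math.log(class_counts[True] / class_counts[False])
    for word in set(text.lower().split()):
        # Add-one smoothing (an assumption) avoids zero probabilities.
        p_spam = (word_counts[True][word] + 1) / (class_counts[True] + 2)
        p_ham = (word_counts[False][word] + 1) / (class_counts[False] + 2)
        score += math.log(p_spam / p_ham)
    return score

# Invented toy data, purely for illustration.
TOY_TRAINING = [
    ("win free money now", True),
    ("verify your account password now", True),
    ("meeting agenda for monday", False),
    ("lunch on monday with the project team", False),
]

model = train(TOY_TRAINING)
```

A positive score from spam_log_odds("free money", *model) would suggest spam and a negative score would suggest legitimate mail. Additional clues of the kind the abstract mentions, such as time of day or header inconsistency, could in principle be added as further log-ratio terms in the same sum.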
Type
Conference Paper
Description
This paper appeared in the Proceedings of the 8th IEEE Workshop on Information Assurance, West Point, NY, June 2007.
Sponsors
Supported in part by the National Science Foundation under the Cyber Trust Program.
Citation
Proceedings of the 8th IEEE Workshop on Information Assurance, West Point, NY, June 2007.
Distribution Statement
Approved for public release; distribution is unlimited.