REDUCING ADVERSARIAL FAILURES IN NEURAL NETWORKS USING “NONE OF THE ABOVE” CLASS PRIORS
Authors
Mendolia, Alexi N.
Subjects
machine learning classification
system safety
deep learning
artificial neural networks
adversarial examples
defense
Advisors
McClure, Patrick
Date of Issue
2023-12
Publisher
Monterey, CA; Naval Postgraduate School
Abstract
While machine learning presents an opportunity for increased automation in systems, machine-learning models are also subject to adversarial attacks. This thesis builds on previous methods for defending against adversarial examples by training a model with a "None of the Above" (NOTA) class. Whereas standard classification models force every input into one of a fixed number of classes, NOTA models implement an additional class capturing the notion that some inputs will not match any of the given classes. While previous methods largely achieve state-of-the-art adversarial robustness, they are less successful against some of the more complex adversarial attack vectors. This thesis aims to increase adversarial robustness through a prior that biases predictions toward the NOTA class. We conduct a validation grid search over the CIFAR-10 image dataset to find the prior probability for the NOTA class that best decreases adversarial success. Through this work, we provide a proof of concept that the addition of a NOTA-biased prior can decrease the success of some of the more complex evasion attacks. As the DoD moves to increase its use of machine-learning models, these results will be increasingly important for building models with adequate security.
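The core idea of the abstract — an extra NOTA class whose probability is boosted by a prior — can be sketched as applying a log-prior bias at prediction time. This is a minimal illustrative NumPy sketch, not the thesis's actual implementation: the function name `nota_predict`, the zero-logit stand-in for the NOTA class, and the `nota_prior` hyperparameter (which the thesis selects by grid search) are all assumptions made for illustration.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def nota_predict(logits, nota_prior=0.5):
    """Append a NOTA class and bias predictions toward it with a log-prior.

    logits: (n, k) array of scores for the k real classes.
    nota_prior: prior probability mass placed on NOTA (hypothetical
    hyperparameter; the thesis tunes this value via a validation grid search).
    Returns predicted class indices (index k means "none of the above")
    and the full (k+1)-class probabilities.
    """
    n, k = logits.shape
    # Hypothetical stand-in: a trained NOTA model would emit k+1 logits
    # directly; here we append a zero logit for the NOTA class.
    full = np.concatenate([logits, np.zeros((n, 1))], axis=1)
    # Log-prior: nota_prior on NOTA, remaining mass split over real classes.
    prior = np.append(np.full(k, (1.0 - nota_prior) / k), nota_prior)
    probs = softmax(full + np.log(prior))
    return probs.argmax(axis=1), probs
```

The effect is that low-confidence inputs — including many adversarially perturbed ones — fall into the NOTA class rather than being forced into one of the real classes, while confidently classified inputs are unaffected.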
Type
Thesis
Department
Computer Science (CS)
Distribution Statement
Approved for public release. Distribution is unlimited.
Rights
This publication is a work of the U.S. Government as defined in Title 17, United States Code, Section 101. Copyright protection is not available for this work in the United States.
