Using Texture Vector Analysis to Measure Computer and Device File Similarity

Authors
Allen, Bruce
Date of Issue
2021-05-10
Date
05/10/21
Publisher
Monterey, California. Naval Postgraduate School
Abstract
Executable programs run on computers and digital devices. These programs are pre-installed by the device vendor or are downloaded or copied from storage media. Studying similarity between executable files is useful for verifying valid updates, identifying potential copyright infringement, identifying malware, and detecting other abuse of purchased software. An alternative to simplistic methods of file comparison, such as comparing hash codes to see whether two files are identical, is to identify the "texture" of files and then assess how similar that texture is between files. To test this idea, we experimented with a sample of 23 Windows executable file families comprising 1,386 files. We identify points of similarity between files by comparing sections of data by their standard deviations, means, modes, mode counts, and entropies, which together form a texture vector for each section. When two vectors are sufficiently similar, we calculate the offsets (shifts) needed to align the corresponding sections. By analyzing these shifts, we can measure file similarity efficiently. By plotting similarity vs. time, we track the progression of similarity between files.
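The texture idea in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the section size of 512 bytes, the Euclidean distance test, the `tolerance` value, and the function names `texture_vector`, `texture_vectors`, and `shift_histogram` are all assumptions made for illustration. The sketch computes the five statistics named in the abstract for each fixed-size section, then tallies the byte offsets (shifts) between sections whose vectors are close.

```python
import math
from collections import Counter


def texture_vector(section: bytes) -> tuple:
    """Return (std dev, mean, mode, mode count, entropy) for one section of bytes."""
    n = len(section)
    if n == 0:
        raise ValueError("section must not be empty")
    counts = Counter(section)
    mean = sum(section) / n
    variance = sum((b - mean) ** 2 for b in section) / n
    # Most frequent byte value; ties go to the smallest byte value (an assumption).
    mode, mode_count = max(counts.items(), key=lambda kv: (kv[1], -kv[0]))
    # Shannon entropy in bits per byte.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return (math.sqrt(variance), mean, mode, mode_count, entropy)


def texture_vectors(data: bytes, section_size: int = 512) -> list:
    """Split data into fixed-size sections (assumed size) and describe each one."""
    return [
        texture_vector(data[i:i + section_size])
        for i in range(0, len(data), section_size)
    ]


def shift_histogram(a: bytes, b: bytes, section_size: int = 512,
                    tolerance: float = 1.0) -> Counter:
    """Count byte shifts between sections of a and b with similar vectors.

    "Similar" here means Euclidean distance <= tolerance, which is a
    hypothetical criterion chosen only for this sketch.
    """
    va = texture_vectors(a, section_size)
    vb = texture_vectors(b, section_size)
    shifts = Counter()
    for i, x in enumerate(va):
        for j, y in enumerate(vb):
            if math.dist(x, y) <= tolerance:
                shifts[(j - i) * section_size] += 1
    return shifts
```

Under these assumptions, a shift value that collects many matches suggests that a large region of one file reappears in the other at that offset, which is one plausible way to turn the shifts into a similarity measure.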
Type
Presentation
Identifiers
NPS Report Number
SYM-AM-21-082
Sponsors
Prepared for the Naval Postgraduate School, Monterey, CA 93943.
Naval Postgraduate School
Distribution Statement
Approved for public release; distribution is unlimited.
Rights
This publication is a work of the U.S. Government as defined in Title 17, United States Code, Section 101. Copyright protection is not available for this work in the United States.