Transport traffic analysis for abusive infrastructure characterization
We investigate a promising approach that uses per-packet TCP header and timing information to identify discriminating features of communications likely to involve abusive hosts. These features capture congestion, flow-control, and other low-level network and system characteristics indicative of an abusive network host. Our approach is agnostic to IP addresses and content, and is therefore privacy-preserving, which permits wider deployment than previously possible. Importantly, the modeled characteristics are inherent to the poorly connected, under-provisioned, low-end, and overloaded hosts or links typical of abusive infrastructure, making them difficult for an adversary to manipulate. In contrast to existing network-centric approaches that rely on flow-level records, fine-grained per-packet features yield superior performance with negligible processing impact. On real-world traces collected by accessing 40,000 Alexa and 30,000 known-abusive web sites, we achieve a classification accuracy of 94% with a 3% false positive rate using only transport features.
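As a rough illustration of what per-packet transport features might look like, the following minimal sketch summarizes one direction of a single TCP flow using only header fields and timestamps. The abstract does not enumerate the actual feature set, so the `Packet` record, the `transport_features` function, and the specific features chosen (inter-arrival gaps, apparent retransmissions, zero-window advertisements) are all assumptions made for this example, not the method evaluated in the work.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class Packet:
    """Hypothetical per-packet record: timing and TCP header fields only."""
    ts: float     # capture timestamp in seconds
    seq: int      # TCP sequence number
    length: int   # TCP payload length in bytes
    window: int   # advertised receive window in bytes


def transport_features(packets: List[Packet]) -> Dict[str, float]:
    """Summarize one direction of one TCP flow (illustrative features only)."""
    if not packets:
        raise ValueError("flow has no packets")

    # Timing: gaps between consecutive packets hint at delay and congestion.
    gaps = [b.ts - a.ts for a, b in zip(packets, packets[1:])]

    # Loss proxy: a data packet that does not advance the highest sequence
    # offset seen so far is counted as an apparent retransmission.
    # Sequence-number wraparound is ignored in this sketch.
    highest_end = None
    retransmissions = 0
    for p in packets:
        if p.length == 0:
            continue  # pure ACKs carry no data
        end = p.seq + p.length
        if highest_end is not None and end <= highest_end:
            retransmissions += 1
        else:
            highest_end = end

    # Flow control: zero-window advertisements suggest an overloaded receiver.
    zero_window = sum(1 for p in packets if p.window == 0)

    return {
        "mean_gap": mean(gaps) if gaps else 0.0,
        "max_gap": max(gaps) if gaps else 0.0,
        "retransmissions": retransmissions,
        "zero_window": zero_window,
        "min_window": min(p.window for p in packets),
    }
```

A feature dictionary of this kind could be computed per flow and passed to any standard classifier. No IP addresses or payload bytes are read, which is the property the abstract describes as privacy-preserving.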