Multi-armed bandit models of network intrusion in the cyber domain
We model attacks against computer networks in the cyber domain from the attacker’s point of view. We consider an attacker with limited resources and time, whose goal is to maximize the expected reward earned by exploiting infected computers, while considering the risks. A computer network is represented as a graph consisting of computers and routers, where each computer has an unknown expected reward and the routers connect sub-networks of computers. At time zero the attacker starts from an infected computer, called the “home computer,” while all the other computers in the network are not infected. In any given period, the attacker can either try to earn a reward by exploiting the subset of infected computers, or choose to expand by infecting adjacent computers and routers, which does not accrue any reward. However, the attacker can exploit an infected computer only if it is connected through other infected computers all the way to the “home computer” (and this connectivity may be lost when attacks are detected). For the linear network model, which is a worst-case scenario from the attacker’s point of view, we find that the optimal number of nodes to attempt to infect is on the order of the square root of the available time, provided the network is sufficiently large. We also determine a critical relationship between the attacker’s probability of infecting a new node and the probability of detection. When this critical condition is met, the attacker should not try to infect any additional nodes.
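The square-root scaling can be illustrated with a deliberately simplified toy calculation. This is a sketch under assumptions that are not taken from the thesis: node rewards are independent Uniform(0, 1) draws, each infection takes exactly one period and always succeeds, there is no detection, and after expanding the attacker exploits only the single best infected node for the remaining periods. The names `expected_reward` and `best_n` are hypothetical.

```python
import math


def expected_reward(n: int, horizon: int) -> float:
    """Expected total reward in the toy model when n nodes are infected first.

    Assumption (not from the source): rewards are i.i.d. Uniform(0, 1), so the
    expected value of the best of n nodes is n / (n + 1).
    """
    if n <= 0 or n >= horizon:
        return 0.0
    # n periods are spent expanding; the remaining periods exploit the best node.
    return (horizon - n) * n / (n + 1)


def best_n(horizon: int) -> int:
    """Brute-force the number of nodes to infect that maximizes expected reward."""
    if horizon < 2:
        raise ValueError("horizon must be at least 2 periods")
    return max(range(1, horizon), key=lambda n: expected_reward(n, horizon))


if __name__ == "__main__":
    for horizon in (99, 9999):
        # Setting the derivative of (T - n) * n / (n + 1) to zero gives
        # n = sqrt(T + 1) - 1 for this toy objective.
        closed_form = math.sqrt(horizon + 1) - 1
        print(horizon, best_n(horizon), closed_form)
```

In this toy setting the maximizer grows like the square root of the horizon, which is consistent in spirit with the linear-network result stated above. It does not reproduce the thesis model, which also accounts for infection failure, detection, and loss of connectivity.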
Approved for public release; distribution is unlimited