The Impact of Router Outages on the AS-level Internet
Abstract
We propose and evaluate a new metric for understanding the dependence of the AS-level Internet on individual routers. Whereas prior work uses large volumes of reachability probes to infer outages, we design an efficient active probing technique that directly and unambiguously reveals router restarts. We use our technique to survey 149,560 routers across the Internet for 2.5 years. 59,175 of the surveyed routers (40%) experience at least one reboot, and we quantify the resulting impact of each router outage on global IPv4 and IPv6 BGP reachability.
Our technique complements existing data and control plane outage analysis methods by providing a causal link from BGP reachability failures to the responsible router(s) and multi-homing configurations. While we found the Internet core to be largely robust, we identified specific routers that were single points of failure for the prefixes they advertised. In total, 2,385 routers -- 4.0% of the routers that restarted over the course of 2.5 years of probing -- were single points of failure for 3,396 IPv6 prefixes announced by 1,708 ASes. We inferred that 59% of these routers were customer-edge border routers. 2,374 (70%) of the withdrawn prefixes were not covered by a less specific prefix, so 1,726 (2.9%) of the routers that restarted were single points of failure for at least one network. However, a covering route did not imply reachability during a router outage, as no previously-responsive address in a withdrawn more specific prefix responded during a one-week sample. We validated our reboot and single point of failure inference techniques with four networks, finding no false positive or false negative reboots, but some false negatives in our single point of failure inferences.
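The distinction the abstract draws between withdrawn prefixes that are and are not covered by a less specific prefix can be illustrated with a minimal sketch. It is not the authors' code: the function name `covering_prefixes` and the example prefixes (taken from the IPv6 and IPv4 documentation ranges) are assumptions made here for illustration, and the check uses only Python's standard `ipaddress` module.

```python
import ipaddress


def covering_prefixes(withdrawn, still_announced):
    """Return announced prefixes that are less specific than `withdrawn` and contain it.

    `covering_prefixes` is a hypothetical helper, not part of the paper's tooling.
    """
    target = ipaddress.ip_network(withdrawn)
    covers = []
    for prefix in still_announced:
        candidate = ipaddress.ip_network(prefix)
        # subnet_of() raises TypeError across IP versions, so skip mismatches.
        if candidate.version != target.version:
            continue
        # A covering route is strictly shorter and contains the withdrawn prefix.
        if candidate.prefixlen < target.prefixlen and target.subnet_of(candidate):
            covers.append(candidate)
    return covers


# Illustrative routing table snapshot (documentation prefixes, not real data).
announced = ["2001:db8::/32", "2001:db8:ff00::/40", "192.0.2.0/24"]
covered = covering_prefixes("2001:db8:1200::/40", announced)
uncovered = covering_prefixes("2001:db8:1200::/40", ["2001:db8:ff00::/40"])
```

An empty result corresponds to the case the abstract counts as a single point of failure for a network. A non-empty result only shows that a covering route exists; as the abstract notes, that did not imply the addresses stayed reachable during the outage.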
Description
The article of record as published may be found at http://dx.doi.org/10.1145/3098822.3098858
Rights
This publication is a work of the U.S. Government as defined in Title 17, United States Code, Section 101. Copyright protection is not available for this work in the United States.
Related items
Showing items related by title, author, creator and subject.
- On Route Aggregation
Le, F.; Zhang, H.; Xie, Geoffrey (2011-12). Route Aggregation (RA), the method to supersede a set of routes by a single, more general route, is a fundamental mechanism to the Internet scalability. Yet, despite its importance, it is poorly understood. We present the ...
- VULNERABILITY ANALYSIS OF THE PHYSICAL AND LOGICAL NETWORK TOPOLOGY ON THE U.S. VIRGIN ISLANDS
Jones, Cameron (Monterey, CA; Naval Postgraduate School, 2022-03). In 2017, two hurricanes, Irma and Maria, left the U.S. Virgin Islands with a destroyed telecommunications infrastructure, demolished homes and collapsed powerlines. Even though the communications system is broken into ...
- Speedtrap: Internet-Scale IPv6 Alias Resolution
Luckie, Matthew; Beverly, Robert; Brinkmeyer, William; claffy, kc (2013-10). Impediments to resolving IPv6 router aliases have precluded understanding the emerging router-level IPv6 Internet topology. In this work, we design, implement, and validate the first Internet-scale alias resolution ...