We’ve known for 20 years it was coming. Is this the crisis too good to waste?
There is, in a basic sense, nothing new to be said about the global computer outage of the week just past.
We know that attacks and accidents are causes of faults. We know that attacks and accidents differ little in post-event mitigations but differ crucially in pre-event protection planning.
We know that in a large system redundant components make random faults less likely to produce global faults. We know that in a large system redundant components make intentional faults more likely to produce global faults. In short, we know that redundancy can be protective or it can be risk-creating.
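The arithmetic behind that contrast can be sketched directly. The numbers below are illustrative, not from the essay: with independent replicas, a global fault requires every replica to fail at once; with identical replicas sharing one flaw, a single trigger is enough.

```python
# Illustrative sketch: protective redundancy vs. monoculture "redundancy".

def p_all_fail_independent(p: float, n: int) -> float:
    """Probability that n independently failing replicas all fail at once."""
    return p ** n

def p_all_fail_monoculture(p: float) -> float:
    """Identical replicas share one flaw: one trigger takes all of them down."""
    return p  # failures are perfectly correlated; replication adds nothing

p = 0.01  # per-replica failure probability (assumed for illustration)
print(p_all_fail_independent(p, 3))  # diversity drives global failure toward zero
print(p_all_fail_monoculture(p))     # a jillion alike: no better than one
```

The same count of "redundant" components yields a one-in-a-million global fault in the first case and a one-in-a-hundred global fault in the second.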
We know that protective redundancy does not just happen, and we know that a jillion devices all alike offer no protection at all but rather the opposite. We know that the wellspring of risk is dependence, and we know that aggregate risk is proportional to aggregate dependence. Because dependence is transitive, so too is risk. That you may not yourself depend on something directly does not mean that you do not depend on it indirectly. We call the transitive reach of dependence interdependence, which is to say correlated risk. We know that complexity hides interdependence, and unacknowledged interdependence is the precondition for black swan events.
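Transitive dependence is a graph-reachability fact, and it can be made concrete. The dependency graph below is hypothetical (names invented for illustration): the application never names the specialist supplier, yet the supplier sits squarely in its risk surface.

```python
# Sketch: a service's true risk surface is the transitive closure of its
# dependencies, not merely its direct ones. Graph and names are invented.
from collections import deque

def transitive_dependencies(deps: dict[str, set[str]], start: str) -> set[str]:
    """Everything `start` depends on, directly or indirectly (BFS closure)."""
    seen: set[str] = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for dep in deps.get(node, set()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen

deps = {
    "my_app": {"cloud_provider"},
    "cloud_provider": {"endpoint_agent"},
    "endpoint_agent": {"content_update_channel"},
}
# my_app names only cloud_provider, yet inherits all three risks:
print(transitive_dependencies(deps, "my_app"))
```

Complexity hides exactly this: the closure is larger than anything any single party wrote down.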
We know that markets evolve to three generalist suppliers for any widespread consumer need or want. We also know that the contraction from many suppliers to three suppliers happens faster in the absence of regulation. And we know that specialist suppliers servicing the generalists create correlated risks the generalists cannot see. We also know that specialists penetrate markets at their rate of adoption, which is proportional to their rate of innovation. We know that the rate of adoption of new tech has accelerated over time (ChatGPT: 100 million users in 60 days) and so, too, has the creation of correlated risks.
We know that total reliability can only be achieved either by driving the mean time between failures to infinity or by driving the mean time to repair to zero. We know that neither route puts monies on the bottom line, and especially so in the face of competition based on rapid innovation alone.
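The two routes to total reliability fall out of the standard steady-state availability formula, A = MTBF / (MTBF + MTTR); the numbers below are mine, chosen for illustration. A equals 1 only when MTBF goes to infinity or MTTR goes to zero; any finite MTBF paired with any nonzero MTTR leaves a gap.

```python
# Sketch of the standard availability formula; figures are illustrative.

def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: fraction of time the system is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

print(availability(10_000, 4))  # long MTBF, short MTTR: good, but not total
print(availability(10_000, 0))  # instant repair: the only finite route to 1.0
```

Both routes cost money that rapid-innovation competition gives no credit for.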
We know that a state of security is the absence of unmitigable surprise, not no surprise, but no surprise without preexisting mechanisms of mitigation. As such, we know that the pinnacle goal of security engineering is no silent failure, not no failure, but no silent failure—you cannot mitigate what you don’t know is happening.
Therefore, if we choose to act on what we know, then we also know that security policy and competition policy are henceforth conjoined. We cannot and will not have zero cascade failures if any tech is allowed to become universal, to become a monopoly in its sphere. We cannot and will not have an absence of unmitigable surprise if we do not require no silent failure and if we do not require pre-built, testable mitigations for detectable failures.
Thus there is nothing new to be said about the global computer outage of the week just past.
– Dan Geer. Published courtesy of Lawfare.