And because you will fuck this up, too, the goal is to build for survivability.
I wasn't fired.
But we found errors in project management, found two broken cluster failover scripts, an abandoned migration, and an error in MySQL (that was already fixed in the most current version, which we could not migrate to, because of the cluster failover scripts, because of wrong project managements).
Five Whys.
They work better than five Nein!s