One small addition to your discussion/scenario...
As has been pointed out on this list, the actual rate of changes in the root zone is on the order of a few per week. Statistically, that means your 24-hour rollback would often have no effect at all. Now compare this to the change rate in some very large ccTLD or gTLD, which is, I would assume, measured in thousands per day.
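To put rough numbers on that comparison, here is a minimal sketch that assumes zone changes arrive as a Poisson process. The two rates are illustrative assumptions standing in for "a few per week" and "thousands per day", not measured figures, and the function names are made up for this example.

```python
import math

def p_no_change(changes_per_day: float, window_days: float = 1.0) -> float:
    """Probability that a Poisson process sees zero changes in the window."""
    return math.exp(-changes_per_day * window_days)

def expected_lost(changes_per_day: float, window_days: float = 1.0) -> float:
    """Expected number of changes discarded by rolling back the window."""
    return changes_per_day * window_days

# Assumed rates, for illustration only.
ROOT_CHANGES_PER_DAY = 3 / 7    # "a few per week", taken here as 3
TLD_CHANGES_PER_DAY = 5000.0    # "thousands per day", taken here as 5000

if __name__ == "__main__":
    for name, rate in (("root", ROOT_CHANGES_PER_DAY),
                       ("large TLD", TLD_CHANGES_PER_DAY)):
        print(f"{name}: P(rollback undoes nothing) = {p_no_change(rate):.3f}, "
              f"expected changes lost = {expected_lost(rate):.1f}")
```

Under those assumptions, a 24-hour rollback of the root would undo nothing roughly 65% of the time, while the same rollback on the large TLD would be expected to discard about 5000 changes.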
Now a short quiz:
(i) Which part of the potential outage problem should we be spending a lot of energy worrying about, based on the impact of a simple halt to effective updating for a while or, in your scenario, a rollback?
(ii) Why does all the energy go into worrying about the root instead?
(iii) While (as has also been pointed out) the software and systems run by the root operators are fairly diverse, protecting them from easy, one-size-fits-all versions of certain types of attacks, would you care to guess at the diversity level among the servers for the typical large ccTLD or gTLD?
(iv) I can be reached, via various forwarding aliases (and, in some cases, almost by accident), using domain names that are subdomains of five different TLDs (although I use most of them sufficiently infrequently that, in the case of a COM outage, you'd probably have to phone me to find out which to use). How about you? Guess what the count is for the typical Internet user.
sigh.
    john
--On Monday, 08 December, 2003 17:21 -0500 Dan Kolis <dank@xxxxxxxxxxxxxxxxxx> wrote:
As a (not too) humble regular DNS user as opposed to an insider... What is the worst-case scenario on this, anyway?
It seems to me our buddies at the North American power reliability board (whatever) would say they can't POSSIBLY fail such that power is out for days. Yet it happened. I think it killed some folks here and there, too.
It seems to me I'm speaking from a skeptical approach, which is always the best when the downside's big.
If all the root operators had an offline copy of their DNS entries and rolled back 24 hours in a crisis, so what? 99.99% of DNS UDP queries would resolve; a few new ones would be troubled. No Anycast, no BGP, just roll back a day and reassess the systemic failure for a next plan. Turn all that off and think for a day or so.
It seems to me a smaller but non-trivial chance is that the whole thing becomes unreliable because the (maybe) millions of subdomains get clobbered. For instance, I think I'm right that the subdomain www.{anything} is incredibly distributed. Never an SOA at a TLD or ccTLD... You know what I mean.
If a "WWW snagger" rewriter virus existed that left 100% of the root servers perfect (either due to a brilliant management plan, disinterest, or dumb luck, etc.) but www.{any} didn't work, the loss of functionality would be close to having the roots lost, wouldn't it?
Harder to fix, because the people involved haven't been to a fancy workshop of what-ifs. And they're hard to contact because suddenly the Internet is unreliable. There was an outage in the switched telephone system much like this about 12 years ago. None of the technocrats who could fix it could find each other, so the outage persisted for a long time until an unnamed vendor (!) bicycled new binaries to 400 phone switches.
regards Dan
Dan Kolis - Lindsay Electronics Ltd dank@xxxxxxxxxxxxxxxxxx
50 Mary Street West, Lindsay Ontario Canada K9V 2S7
(705) 324-2196 X272 (705) 324-5474 Fax
An ISO 9001 Company; /Document end