From the “DNS Is Fault-Tolerant” files:
The entire .se (Sweden) Top Level Domain was knocked offline for a few hours today (EDT), due to an error in DNS configuration. It’s an astounding failure, and one that, in my opinion, shouldn’t technically be able to occur.
Time and again, smart people remind me that DNS is a redundant system that is highly available. Yet here we are in 2009 and the entire .se TLD is offline because of a configuration error in DNS.
According to the .SE Internet Infrastructure Foundation, they inadvertently sent out an incorrect zone file on Monday, October 12, at 21:45 local time, in connection with planned maintenance work.
“The cause was an incorrect software update, which, despite our testing procedures were not detected,” .SE said in a statement. “Thanks to well-functioning surveillance system .SE discovered the error immediately and a new file with the DNS data (zone file) was produced and distributed within one hour.”
An hour may not sound bad, but DNS resolvers all over the world cache copies of records, and each one keeps the copy it has until that copy expires. A resolver that picked up the bad data kept serving it even after .SE published the fix, so the end result is a cascading failure of the entire .se TLD that varies in length depending on where you are. That’s 900,000 domains without service due to a DNS error that should never have happened.
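To make the "varies in length depending on where you are" point concrete, here is a rough sketch of how cached copies stretch a one-hour mistake into a much longer outage. Everything in it is an assumption for illustration: the resolver names, fetch times, and TTL values are hypothetical and are not the real .se zone settings, and the model ignores details such as negative caching. The only figure taken from the statement above is that a fix went out within about an hour.

```python
# Toy model: a caching resolver only notices the fix when its cached copy expires.
# All names and numbers below are hypothetical, chosen for illustration only.

def recovery_minute(first_fetch: int, ttl: int, fix_minute: int) -> int:
    """Return the minute at which a resolver first refetches at or after the fix.

    The resolver fetches at `first_fetch` (minutes after the bad zone file went
    out) and refetches every `ttl` minutes, each time its cached copy expires.
    Any fetch before `fix_minute` returns bad data.
    """
    if ttl <= 0:
        raise ValueError("ttl must be positive")
    t = first_fetch
    while t < fix_minute:
        t += ttl  # cached bad copy expires, resolver asks again
    return t

# Assumption: the corrected zone file is live 60 minutes after the bad one.
FIX_MINUTE = 60

# Hypothetical resolvers: (minute of first fetch, TTL in minutes).
RESOLVERS = {
    "resolver_a": (5, 30),
    "resolver_b": (50, 120),
    "resolver_c": (59, 1440),
}

recovery = {
    name: recovery_minute(fetch, ttl, FIX_MINUTE)
    for name, (fetch, ttl) in RESOLVERS.items()
}

for name, minute in recovery.items():
    print(f"{name}: back to good data at minute {minute}")
```

In this sketch the resolver with a 30-minute TTL recovers five minutes after the fix, while the one that cached the bad answer a minute before the fix with a day-long TTL is still broken roughly a day later. Same mistake, same fix, very different outage depending on which resolver you happen to sit behind.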