On Thursday of last week, Facebook went down. Most of us in the Internet business know this; a frighteningly large proportion of the population actually experienced it first-hand. I read more than one posting suggesting that worldwide productivity surged on Thursday. It reminded me of the episode of The Simpsons in which Marge mounts a crusade against cartoon violence on television (read: Itchy and Scratchy) and succeeds in getting the cartoon watered down. (The revised theme song begins, “They love!/They share!/They share and love and share!”) Kids all over the world go outside to swing on swings, blow bubbles, and play tag. My wife frequently cites that episode herself, wistfully, while looking alternately at our sunny backyard and at our kids, plopped on the sofa.
Anyway, besides reminding me of old episodes of The Simpsons, Thursday’s news reminded me of how quick people are to blame DNS. The initial reports of the problem had cited DNS as the cause. After doing a little research into what had gone wrong (by reading the archives of the NANOG mailing list and a posting by Facebook’s engineering organization explaining the cause of the problem in that admirabl…), it was obvious that, while the problem affected DNS, it wasn’t chiefly an issue with DNS.
Why the rush to blame DNS? My theory is that old web browsers caused this. Old web browsers were forever slapping error messages in front of users that suggested that DNS was at fault when, in many cases, someone had simply mistyped a URL. For many users, that error message was their first exposure to DNS. For some, it may have been their only exposure to it. It’s no wonder that their knee-jerk reaction is to blame DNS whenever they can’t get somewhere they want to go, and that they perceive DNS as an unreliable technology.
How do you determine whether a problem really is caused by DNS? Well, if you can’t resolve the domain name of a server but you can get there by specifying its IP address, that’s a strong indication that there’s something wrong with DNS. (Of course, you may have a difficult time finding the IP address of the server if there’s something wrong with DNS.) Or you can use a query tool such as dig to look up the names of the authoritative name servers for the zone you suspect is having problems (say, facebook.com). You can then query those name servers directly with the same tool and examine the response you get (if any). Here’s an example of how to use dig to query a particular name server for the SOA record of the facebook.com zone:
% dig @<server name or IP address> soa facebook.com. +norec
(+norec means send a nonrecursive query.)
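If you want to run that query from a script, a minimal Python sketch might look like the following. It assumes dig is installed and on your PATH; the function names dig_command and run_dig are hypothetical, chosen only for this illustration, and the 30-second timeout is an arbitrary choice.

```python
import subprocess


def dig_command(server, zone, rtype="soa"):
    """Build the argument list for a nonrecursive dig query.

    Mirrors the command line above: dig @<server> soa <zone> +norec
    """
    return ["dig", f"@{server}", rtype, zone, "+norec"]


def run_dig(server, zone, rtype="soa", timeout=30):
    """Run dig and return whatever it printed to standard output.

    Assumes dig is on PATH. Raises subprocess.TimeoutExpired if dig
    has not finished within `timeout` seconds.
    """
    result = subprocess.run(
        dig_command(server, zone, rtype),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout


if __name__ == "__main__":
    # Only prints the command; call run_dig() to actually send the query.
    print(dig_command("ns1.facebook.com.", "facebook.com."))
```

Passing the arguments as a list, rather than as one shell string, avoids quoting problems if a server name ever comes from user input.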
If you see SERVFAIL in the status field of the response, that’s a positive indication that the name server is malfunctioning or misconfigured:
% dig @ns1.facebook.com. +norec
; <<>> DiG 9.6.0-APPLE-P2 <<>> @ns1.facebook.com. +norec
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 43062
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
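If you are checking several name servers, picking the status field out of the output by eye gets tedious. Here is a minimal Python sketch, assuming you have captured dig’s output as text; the function name dig_status is hypothetical, and the sample text is taken from the header lines shown above.

```python
import re


def dig_status(output):
    """Return the status field from dig's header line, or None.

    None means no header was found, for example because the name
    server never answered.
    """
    match = re.search(r"status:\s*([A-Z]+)", output)
    return match.group(1) if match else None


# Header lines from the dig response shown above.
sample = """\
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 43062
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
"""

if __name__ == "__main__":
    print(dig_status(sample))  # prints SERVFAIL
```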
If you get no response at all, that’s less definitive, since it may be that the site that hosts the name server (and maybe the web server, too) has lost Internet connectivity.
Don’t discount the possibility that you may be the one who’s lost connectivity! Be sure you can reach some name servers. The roots are a good choice:
% dig @a.root-servers.net. soa .
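The checks described so far can be folded into one small decision function. This is only a sketch of the reasoning above, not a complete diagnostic: the function name triage, its parameters, and the wording of the verdicts are all hypothetical, and you gather the observations yourself (with dig or otherwise) before calling it.

```python
def triage(root_reachable, auth_status, name_resolves=None, ip_reachable=None):
    """Classify an outage from the observations described above.

    root_reachable: True if a root name server answered your query.
    auth_status:    status field from the zone's authoritative name
                    server, or None if it sent no response at all.
    name_resolves, ip_reachable: optional results of the name-vs-IP
                    test; leave as None if you did not run it.
    """
    if not root_reachable:
        # You may be the one who has lost connectivity.
        return "check your own connectivity"
    if auth_status == "SERVFAIL":
        return "name server malfunctioning or misconfigured"
    if name_resolves is False and ip_reachable:
        # The name fails but the IP address works.
        return "likely a DNS problem"
    if auth_status is None:
        # No response is less definitive than SERVFAIL.
        return "inconclusive: the site may be offline"
    return "no sign of a DNS fault"


if __name__ == "__main__":
    print(triage(root_reachable=True, auth_status="SERVFAIL"))
```

The root-server check comes first because none of the other observations mean much if your own connection is down.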
With just this little bit of triage, you’ll have a much better idea of whether DNS is really to blame.