Comcast/Netgear routers eat SRV records

I've been troubleshooting an issue with two Comcast users in the Memphis area. One is a recent Comcast subscriber, and the other has been using Comcast for months, noticed this issue, and invented a kludgey workaround on his own that involves several VPNs.

After lots of head-scratching, I was able to determine that the users' local DNS is in fact broken and does not return SRV records.

For example, take this query against Google's public DNS:

; <<>> DiG 9.8.5-P1 <<>> srv _xmpp-client._tcp.google.com @8.8.8.8
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10142
;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;_xmpp-client._tcp.google.com.    IN  SRV

;; ANSWER SECTION:
_xmpp-client._tcp.google.com. 900 IN    SRV 20 0 5222 alt2.xmpp.l.google.com.
_xmpp-client._tcp.google.com. 900 IN    SRV 20 0 5222 alt3.xmpp.l.google.com.
_xmpp-client._tcp.google.com. 900 IN    SRV 20 0 5222 alt1.xmpp.l.google.com.
_xmpp-client._tcp.google.com. 900 IN    SRV 20 0 5222 alt4.xmpp.l.google.com.
_xmpp-client._tcp.google.com. 900 IN    SRV 5 0 5222 xmpp.l.google.com.

;; Query time: 36 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Fri Nov 01 09:42:32 CDT 2013
;; MSG SIZE  rcvd: 251

Running the same query against Level3 DNS (4.2.2.2) and OpenDNS (208.67.222.222) returns the same results. There are SRV records to be found.
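If dig isn't handy on a user's machine, the same comparison can be roughly sketched in Python using only the standard library. This is a minimal sketch, not a full resolver: the function names (`build_query`, `answer_count`, `ask`, `compare`) are made up for illustration, it speaks plain UDP with no EDNS or TCP fallback, and it only reads the answer count from the response header rather than decoding the records.

```python
import socket
import struct

SRV_TYPE = 33   # SRV record type number (RFC 2782)
IN_CLASS = 1    # Internet class


def build_query(name, qtype=SRV_TYPE, query_id=0x1234):
    """Build a minimal DNS query packet with the RD (recursion desired) bit set."""
    # Header: id, flags (0x0100 = RD), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, IN_CLASS)


def answer_count(response):
    """Return the ANCOUNT field (bytes 6-7) of a raw DNS response."""
    return struct.unpack("!H", response[6:8])[0]


def ask(server, name, timeout=3.0):
    """Send one UDP query to `server` on port 53 and return the raw response."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        data, _ = sock.recvfrom(4096)
        return data


def compare(servers, name):
    """Print how many answers each server returns for an SRV lookup of `name`."""
    for server in servers:
        try:
            print(server, "answers:", answer_count(ask(server, name)))
        except OSError as exc:  # includes timeouts
            print(server, "failed:", exc)
```

Calling `compare(["8.8.8.8", "4.2.2.2", "208.67.222.222"], "_xmpp-client._tcp.google.com")` would print one line per resolver. Adding the router's own address to the list is how a broken device would show up as the odd one out.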

Here's the same query run by an affected user with their default settings as installed by Comcast:

; <<>> DiG 9.8.5-P1 <<>> srv _xmpp-client._tcp.google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 31875
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;_xmpp-client._tcp.google.com.    IN  SRV

;; Query time: 5 msec
;; SERVER: 172.16.12.1#53(172.16.12.53)
;; WHEN: Fri Nov 01 07:22:43 CDT 2013
;; MSG SIZE  rcvd: 46

The SRV records are missing, but it doesn't stop there. If the name really didn't exist, the response should be NXDOMAIN, not NOERROR. What's worse, the user's local DNS server has the balls to set the AA ("Authoritative Answer") bit, something that only the actual google.com name servers can claim. And of course, to top off the wrongness, the device claiming to be authoritative leaves the authority section blank.
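Those symptoms all live in the fixed 12-byte DNS header, so they can be checked mechanically. The following is a minimal sketch assuming you already have the raw response bytes in hand; `inspect_header` and `looks_like_bogus_empty_answer` are hypothetical names, and the check is only a heuristic for the exact combination seen above (NOERROR, AA set, zero answers, empty authority section), not a general validator.

```python
import struct


def inspect_header(response):
    """Decode the fields of interest from a DNS header (RFC 1035, section 4.1.1)."""
    _id, flags, _qd, an, ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return {
        "rcode": flags & 0x000F,      # 0 = NOERROR, 3 = NXDOMAIN
        "aa": bool(flags & 0x0400),   # Authoritative Answer bit
        "answers": an,                # ANCOUNT
        "authority": ns,              # NSCOUNT
    }


def looks_like_bogus_empty_answer(response):
    """True for NOERROR + AA + no answers + no authority records."""
    h = inspect_header(response)
    return (
        h["rcode"] == 0
        and h["aa"]
        and h["answers"] == 0
        and h["authority"] == 0
    )
```

A header with the flags from the affected user's output (`qr aa rd ra`, which is 0x8580) and all-zero answer and authority counts trips the check. The Google response (`qr rd ra`, 0x8180, five answers) does not.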

The hardware at fault is a Comcast-provided router, a Netgear WNR1000v2-VC. I don't know who wrote their DNS software, but it's totally and completely wrong. (Why can't they use dnsmasq or dnscache?) A vaguely-related Netgear device got a firmware update some three years ago that corrected a similar issue, yet the issue persists here.

My chosen workaround is for the affected users to stop using their broken routers for DNS and point their machines at a public resolver instead. Google Public DNS and OpenDNS are both viable options, and both handle SRV queries properly.