[LEDE-DEV] Lack of DNS robustness for openwrt.org

Bjørn Mork bjorn at mork.no
Sun May 6 04:24:45 PDT 2018


Hello,

I apologize for bringing up this long-standing issue at a time when you
all have other issues to take care of.  But it has again become a
really pressing issue, at least as seen from the networks I have a
presence in.

The main problem is that there still hasn't been any update to the
*technical* part of the .org delegation:

 bjorn at miraculix:~$ whois openwrt.org|grep Name
 Domain Name: OPENWRT.ORG
 Registrant Name: SPI Hostmaster
 Admin Name: SPI Hostmaster
 Tech Name: SPI Hostmaster
 Name Server: ARRAKIS.DUNE.HU
 Name Server: BELATEGEUSE.DUNE.HU

So those two listed name servers are still the *only* two servers making
a difference when following the tree from root:

bjorn at miraculix:~$ dig ns openwrt.org @a0.org.afilias-nst.info

; <<>> DiG 9.10.3-P4-Debian <<>> ns openwrt.org @a0.org.afilias-nst.info
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 39054
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 2, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;openwrt.org.                   IN      NS

;; AUTHORITY SECTION:
openwrt.org.            86400   IN      NS      arrakis.dune.hu.
openwrt.org.            86400   IN      NS      belategeuse.dune.hu.

;; Query time: 159 msec
;; SERVER: 2001:500:e::1#53(2001:500:e::1)
;; WHEN: Sun May 06 12:56:35 CEST 2018
;; MSG SIZE  rcvd: 95




That would not be an issue if those two servers were independent and
stable.  But they are not. First of all, both depend on being able to
resolve dune.hu.  So we ask one of the hu servers:

bjorn at miraculix:~$ dig ns dune.hu @a.hu

; <<>> DiG 9.10.3-P4-Debian <<>> ns dune.hu @a.hu
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 53327
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 3, ADDITIONAL: 3
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;dune.hu.                       IN      NS

;; AUTHORITY SECTION:
dune.hu.                86400   IN      NS      dns4.vietnamfree.com.
dune.hu.                86400   IN      NS      arrakis.dune.hu.
dune.hu.                86400   IN      NS      belategeuse.dune.hu.

;; ADDITIONAL SECTION:
arrakis.dune.hu.        86400   IN      A       78.24.191.176
belategeuse.dune.hu.    86400   IN      A       217.20.135.200

;; Query time: 51 msec
;; SERVER: 2001:738:4:8000::48#53(2001:738:4:8000::48)
;; WHEN: Sun May 06 12:58:10 CEST 2018
;; MSG SIZE  rcvd: 150




And we naturally get glue for the two servers which are in that same
zone.  But neither of them is answering DNS requests at the moment, from
any of the networks I have access to (which each have millions of users
AFAIK).


bjorn at miraculix:~$ dig ns dune.hu @78.24.191.176

; <<>> DiG 9.10.3-P4-Debian <<>> ns dune.hu @78.24.191.176
;; global options: +cmd
;; connection timed out; no servers could be reached
bjorn at miraculix:~$ dig ns dune.hu @217.20.135.200

; <<>> DiG 9.10.3-P4-Debian <<>> ns dune.hu @217.20.135.200
;; global options: +cmd
;; connection timed out; no servers could be reached


But there is also a third server for dune.hu, so let's try that one:


bjorn at miraculix:~$ dig ns vietnamfree.com @a.gtld-servers.net

; <<>> DiG 9.10.3-P4-Debian <<>> ns vietnamfree.com @a.gtld-servers.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1957
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 5, ADDITIONAL: 6
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;vietnamfree.com.               IN      NS

;; AUTHORITY SECTION:
vietnamfree.com.        172800  IN      NS      irc.vietnamfree.com.
vietnamfree.com.        172800  IN      NS      dns4.vietnamfree.com.
vietnamfree.com.        172800  IN      NS      ns.vietnamfree.com.
vietnamfree.com.        172800  IN      NS      ns3.vietnamfree.com.
vietnamfree.com.        172800  IN      NS      dns5.vietnamfree.com.

;; ADDITIONAL SECTION:
irc.vietnamfree.com.    172800  IN      A       195.56.146.224
dns4.vietnamfree.com.   172800  IN      A       195.56.77.197
ns.vietnamfree.com.     172800  IN      A       195.56.146.224
ns3.vietnamfree.com.    172800  IN      A       202.157.185.115
dns5.vietnamfree.com.   172800  IN      A       62.165.228.216

;; Query time: 147 msec
;; SERVER: 192.5.6.30#53(192.5.6.30)
;; WHEN: Sun May 06 13:02:43 CEST 2018
;; MSG SIZE  rcvd: 215

bjorn at miraculix:~$ dig a dns4.vietnamfree.com @195.56.77.197

; <<>> DiG 9.10.3-P4-Debian <<>> a dns4.vietnamfree.com @195.56.77.197
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 42806
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 5, ADDITIONAL: 6
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;dns4.vietnamfree.com.          IN      A

;; ANSWER SECTION:
dns4.vietnamfree.com.   38400   IN      A       195.56.77.197

;; AUTHORITY SECTION:
vietnamfree.com.        38400   IN      NS      ns.vietnamfree.com.
vietnamfree.com.        38400   IN      NS      dns4.vietnamfree.com.
vietnamfree.com.        38400   IN      NS      dns3.vietnamfree.com.
vietnamfree.com.        38400   IN      NS      ns3.vietnamfree.com.
vietnamfree.com.        38400   IN      NS      kaloz.vietnamfree.com.

;; ADDITIONAL SECTION:
ns.vietnamfree.com.     38400   IN      A       195.56.146.224
dns4.vietnamfree.com.   38400   IN      A       195.56.77.197
dns3.vietnamfree.com.   38400   IN      A       195.56.77.197
ns3.vietnamfree.com.    38400   IN      A       195.56.146.224
kaloz.vietnamfree.com.  38400   IN      A       78.24.191.176

;; Query time: 79 msec
;; SERVER: 195.56.77.197#53(195.56.77.197)
;; WHEN: Sun May 06 13:03:11 CEST 2018
;; MSG SIZE  rcvd: 233



Good. So we get working glue for that one.  Let's ask it about the two
other dune.hu servers then, since those were the ones we needed for
resolving openwrt.org, although we already got the glue and therefore
might consider this an unnecessary step:


bjorn at miraculix:~$ dig a arrakis.dune.hu @195.56.77.197

; <<>> DiG 9.10.3-P4-Debian <<>> a arrakis.dune.hu @195.56.77.197
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 42300
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 3, ADDITIONAL: 4
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;arrakis.dune.hu.               IN      A

;; ANSWER SECTION:
arrakis.dune.hu.        86400   IN      A       78.24.191.176

;; AUTHORITY SECTION:
dune.hu.                86400   IN      NS      dns4.vietnamfree.com.
dune.hu.                86400   IN      NS      arrakis.dune.hu.
dune.hu.                86400   IN      NS      belategeuse.dune.hu.

;; ADDITIONAL SECTION:
dns4.vietnamfree.com.   38400   IN      A       195.56.77.197
arrakis.dune.hu.        86400   IN      A       78.24.191.176
belategeuse.dune.hu.    86400   IN      A       81.0.124.200

;; Query time: 80 msec
;; SERVER: 195.56.77.197#53(195.56.77.197)
;; WHEN: Sun May 06 13:04:26 CEST 2018
;; MSG SIZE  rcvd: 182

bjorn at miraculix:~$ dig a belategeuse.dune.hu @195.56.77.197

; <<>> DiG 9.10.3-P4-Debian <<>> a belategeuse.dune.hu @195.56.77.197
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5119
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 3, ADDITIONAL: 4
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;belategeuse.dune.hu.           IN      A

;; ANSWER SECTION:
belategeuse.dune.hu.    86400   IN      A       81.0.124.200

;; AUTHORITY SECTION:
dune.hu.                86400   IN      NS      dns4.vietnamfree.com.
dune.hu.                86400   IN      NS      arrakis.dune.hu.
dune.hu.                86400   IN      NS      belategeuse.dune.hu.

;; ADDITIONAL SECTION:
dns4.vietnamfree.com.   38400   IN      A       195.56.77.197
arrakis.dune.hu.        86400   IN      A       78.24.191.176
belategeuse.dune.hu.    86400   IN      A       81.0.124.200

;; Query time: 96 msec
;; SERVER: 195.56.77.197#53(195.56.77.197)
;; WHEN: Sun May 06 13:04:40 CEST 2018
;; MSG SIZE  rcvd: 182



Right, so the glue in hu was wrong for belategeuse.dune.hu!!! We now
have another server address we can try, if we were smart enough not to
trust the glue.  And that one is actually responding, and listing the
third openwrt.org server too:


bjorn at miraculix:~$ dig ns openwrt.org @81.0.124.200

; <<>> DiG 9.10.3-P4-Debian <<>> ns openwrt.org @81.0.124.200
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10031
;; flags: qr aa rd; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 3
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;openwrt.org.                   IN      NS

;; ANSWER SECTION:
openwrt.org.            14400   IN      NS      belategeuse.dune.hu.
openwrt.org.            14400   IN      NS      soapstone.yuri.org.uk.
openwrt.org.            14400   IN      NS      arrakis.dune.hu.

;; ADDITIONAL SECTION:
arrakis.dune.hu.        86400   IN      A       78.24.191.176
belategeuse.dune.hu.    86400   IN      A       81.0.124.200

;; Query time: 161 msec
;; SERVER: 81.0.124.200#53(81.0.124.200)
;; WHEN: Sun May 06 13:05:46 CEST 2018
;; MSG SIZE  rcvd: 162
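
The glue mismatch found above can also be checked mechanically.  What
follows is only a minimal Python sketch, assuming the addresses are
copied by hand from the dig output in this mail; the names parent_glue,
authoritative and glue_mismatches are made up for illustration and are
not part of any DNS tool.

```python
# Addresses copied by hand from the dig output in this mail.
# Glue handed out by a.hu, the parent of dune.hu:
parent_glue = {
    "arrakis.dune.hu.": {"78.24.191.176"},
    "belategeuse.dune.hu.": {"217.20.135.200"},
}
# A records served authoritatively by dns4.vietnamfree.com (195.56.77.197):
authoritative = {
    "arrakis.dune.hu.": {"78.24.191.176"},
    "belategeuse.dune.hu.": {"81.0.124.200"},
}


def glue_mismatches(glue, auth):
    """Return {name: (glue_addrs, auth_addrs)} for every name whose
    parent glue differs from the authoritative answer."""
    return {
        name: (addrs, auth.get(name, set()))
        for name, addrs in glue.items()
        if addrs != auth.get(name, set())
    }


mismatches = glue_mismatches(parent_glue, authoritative)
for name, (stale, current) in sorted(mismatches.items()):
    print(f"{name} glue={sorted(stale)} authoritative={sorted(current)}")
```

With the data above, only belategeuse.dune.hu is reported.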




But the number of failures in this delegation chain, eventually
depending on a single server with conflicting address info, is just too
much for many caching resolvers.  They return SERVFAIL for any
openwrt.org name at the moment.
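
To illustrate why a resolver that trusts the glue ends up with nothing
to ask, here is a rough Python model of the situation.  It is a sketch
under stated assumptions, not how any particular resolver is
implemented: the "responding" set just records which addresses answered
my dig queries above, and usable_servers is a hypothetical helper.

```python
# NS names in the .org delegation for openwrt.org (from the dig output above).
delegation = ["arrakis.dune.hu.", "belategeuse.dune.hu."]

# Addresses for those names as seen from two different sources.
glue = {
    "arrakis.dune.hu.": {"78.24.191.176"},
    "belategeuse.dune.hu.": {"217.20.135.200"},
}
authoritative = {
    "arrakis.dune.hu.": {"78.24.191.176"},
    "belategeuse.dune.hu.": {"81.0.124.200"},
}

# Assumption: addresses that answered queries in the tests in this mail.
responding = {"81.0.124.200", "195.56.77.197"}


def usable_servers(ns_names, address_map, responding):
    """Return the NS names that have at least one responding address."""
    return [
        name
        for name in ns_names
        if address_map.get(name, set()) & responding
    ]


print("trusting glue:      ", usable_servers(delegation, glue, responding))
print("after re-resolving: ", usable_servers(delegation, authoritative, responding))
```

In this model the glue leaves no usable server at all, while the
authoritative addresses leave exactly one.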

This should be easy to fix:

1) update the .org delegation to include *all* NS records for the
   openwrt.org zone

2) update the .hu delegation so it provides correct glue records for all
   the servers that both serve and are in the dune.hu zone

3) possibly consider adding/replacing DNS servers with more robust
  (anycasted?) solutions.  Adding or replacing secondaries should at
  least be a no-brainer

4) remove any servers which don't answer reliably. I don't have any
  statistics.  I hope you have. But I am 100% sure this isn't the first
  time I've noticed by chance that arrakis.dune.hu has been unreachable.
  Make it a hidden master if you like.  But keeping unreliable servers
  in the NS records is worse than not having them there.
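
Item 1 can be verified by comparing the NS set in the parent with the NS
set in the zone itself.  A minimal Python sketch, assuming both sets are
copied from the dig answers above (delegation_diff is a hypothetical
helper name):

```python
# NS set in the .org delegation (answer from a0.org.afilias-nst.info).
parent_ns = {"arrakis.dune.hu.", "belategeuse.dune.hu."}

# NS set in the openwrt.org zone itself (answer from 81.0.124.200).
zone_ns = {
    "arrakis.dune.hu.",
    "belategeuse.dune.hu.",
    "soapstone.yuri.org.uk.",
}


def delegation_diff(parent, child):
    """Report NS names present on only one side of the zone cut."""
    return {
        "missing_from_parent": sorted(child - parent),
        "missing_from_zone": sorted(parent - child),
    }


diff = delegation_diff(parent_ns, zone_ns)
print(diff)
```

Both lists are empty once the delegation and the zone agree.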


That's about 10 minutes of work altogether, and it would make the
openwrt.org zone infinitely more reliable.


Thanks,
Bjørn




