[FS#388] odhcpd: A default route is present but there is no public prefix on br-lan thus we don't an

LEDE Bugs lede-bugs at lists.infradead.org
Sat Jan 14 20:30:58 PST 2017


A new Flyspray task has been opened.  Details are below. 

User who did this - Dave Täht (dtaht) 

Attached to Project - LEDE Project
Summary - odhcpd: A default route is present but there is no public prefix on br-lan thus we don't an
Task Type - Bug Report
Category - Base system
Status - Unconfirmed
Assigned To - 
Operating System - All
Severity - High
Priority - Very Low
Reported Version - Trunk
Due in Version - Undecided
Due Date - Undecided
Details - Supply the following if possible:
 - Device problem occurs on: uaplite connecting to archerc7v2
 - Software versions of LEDE release, packages: head as of today

I do have dnsmasq-full installed and this:

config odhcpd 'odhcpd'
	option maindhcp '0'
	option leasefile '/tmp/hosts/odhcpd'
	option leasetrigger '/usr/sbin/odhcpd-update'

(I will back off dnsmasq-full now that I've learned that the flakiness was triggered by the RA going away, but I had long assumed dnsmasq won't do RAs unless you tell it accept_ra. Should I try making odhcpd the main DHCPv4 server?)

 - Steps to reproduce: 

Topology:

ComcastModem -> archerc7v2 -> uap-lite -> wifi
   
Roughly every minute odhcpd says "A default route is present but there is no public prefix on br-lan thus we don't announce a default route!", even though ip -6 addr show shows 

24: br-lan:  mtu 1500 state UP qlen 1000
    inet6 fdaf:dc63:6de9:10::1/60 scope global noprefixroute 
       valid_lft forever preferred_lft forever
    inet6 2601:646:4180:elided::1/64 scope global noprefixroute dynamic 
       valid_lft 332453sec preferred_lft 332453sec
    inet6 fe80::32b5:c2ff:fe75:7faa/64 scope link 
       valid_lft forever preferred_lft forever

as fast as I can poll it.
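
For anyone triaging, here is a minimal Python sketch that sorts addresses like the ones above into coarse scopes. It only applies the standard IPv6 ranges (fe80::/10, fc00::/7, 2000::/3). It is not odhcpd's own "public prefix" test, which I have not checked. The 2001:db8:... address is a hypothetical stand-in for the elided Comcast address, and `classify` is a name made up for this example.

```python
import ipaddress

# Checked in order. These ranges come from the IPv6 addressing standards,
# NOT from odhcpd's source, so treat the labels as a rough guide only.
SCOPES = [
    ("link-local", ipaddress.ip_network("fe80::/10")),
    ("unique-local (ULA)", ipaddress.ip_network("fc00::/7")),
    ("global unicast", ipaddress.ip_network("2000::/3")),
]


def classify(addr_with_prefix: str) -> str:
    """Return a coarse scope label for an address such as 'fe80::1/64'."""
    addr = ipaddress.ip_interface(addr_with_prefix).ip
    for label, net in SCOPES:
        if addr in net:
            return label
    return "other"


if __name__ == "__main__":
    for a in (
        "fdaf:dc63:6de9:10::1/60",
        "2001:db8:1234:5678::1/64",  # hypothetical stand-in, not my real prefix
        "fe80::32b5:c2ff:fe75:7faa/64",
    ):
        print(a, "->", classify(a))
```

By this rough test br-lan does carry a global unicast address next to the ULA one, which is why the warning looks wrong to me.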

The router advertisement shows an advertised router lifetime of 0 alternating with 64k every 30 seconds or so.
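
To check the lifetime without reading a screenshot, the fixed part of a Router Advertisement can be decoded by hand. This is a minimal sketch that assumes you already have the raw ICMPv6 payload (for example, extracted from the packet capture). The header layout follows RFC 4861; the function name and the sample bytes are made up for illustration. A router lifetime of 0 tells hosts to stop using the sender as a default router.

```python
import struct

ND_ROUTER_ADVERT = 134  # ICMPv6 type of a Router Advertisement (RFC 4861)


def parse_ra_header(icmpv6: bytes) -> dict:
    """Decode the fixed 16-byte RA header from a raw ICMPv6 payload."""
    if len(icmpv6) < 16:
        raise ValueError("too short for a Router Advertisement")
    (msg_type, code, checksum, hop_limit, flags,
     router_lifetime, reachable_ms, retrans_ms) = struct.unpack(
        "!BBHBBHII", icmpv6[:16])
    if msg_type != ND_ROUTER_ADVERT:
        raise ValueError("not a Router Advertisement (type %d)" % msg_type)
    return {
        "hop_limit": hop_limit,
        "managed": bool(flags & 0x80),       # M flag
        "other_config": bool(flags & 0x40),  # O flag
        "router_lifetime": router_lifetime,  # seconds
        "is_default_router": router_lifetime > 0,
    }


if __name__ == "__main__":
    # Two hand-built headers (illustrative values, not taken from my capture).
    normal = struct.pack("!BBHBBHII", 134, 0, 0, 64, 0x40, 1800, 0, 0)
    retract = struct.pack("!BBHBBHII", 134, 0, 0, 64, 0x40, 0, 0, 0)
    for pkt in (normal, retract):
        print(parse_ra_header(pkt))
```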

For reasons I don't understand, that seems to be OK: my IPv6 connectivity keeps working.

BUT:

Add a client further downstream that tries, and fails, to get a DHCPv6-PD prefix,
and all hell breaks loose.

Snipping from the log 


````
Sat Jan 14 19:51:25 2017 daemon.info dnsmasq-dhcp[3964]: DHCPDISCOVER(br-lan) 80:2a:a8:86:34:17 
Sat Jan 14 19:51:25 2017 daemon.info dnsmasq-dhcp[3964]: DHCPOFFER(br-lan) 172.26.16.241 80:2a:a8:86:34:17 
Sat Jan 14 19:51:25 2017 daemon.info dnsmasq-dhcp[3964]: DHCPDISCOVER(br-lan) 80:2a:a8:86:34:17 
Sat Jan 14 19:51:25 2017 daemon.info dnsmasq-dhcp[3964]: DHCPOFFER(br-lan) 172.26.16.241 80:2a:a8:86:34:17 
Sat Jan 14 19:51:25 2017 daemon.info dnsmasq-dhcp[3964]: DHCPREQUEST(br-lan) 172.26.16.241 80:2a:a8:86:34:17 
Sat Jan 14 19:51:25 2017 daemon.info dnsmasq-dhcp[3964]: DHCPACK(br-lan) 172.26.16.241 80:2a:a8:86:34:17 
Sat Jan 14 19:51:26 2017 daemon.warn odhcpd[1047]: A default route is present but there is no public prefix on br-lan thus we don't announce a default route!
Sat Jan 14 19:51:26 2017 daemon.warn odhcpd[1047]: DHCPV6 SOLICIT IA_NA from 00030001802aa8863417 on br-lan: no addresses available 
Sat Jan 14 19:51:26 2017 daemon.warn odhcpd[1047]: DHCPV6 SOLICIT IA_PD from 00030001802aa8863417 on br-lan: no prefix available 
Sat Jan 14 19:51:28 2017 daemon.warn odhcpd[1047]: DHCPV6 SOLICIT IA_PD from 00030001802aa8863417 on br-lan: no prefix available 
````

At this point (a few seconds before I see the SOLICIT in the log) the RA-induced route is withdrawn and all hosts lose IPv6 connectivity for about 30 seconds. (The PD request also fails.)
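
To line the warnings up against the outages, a rough sketch like the following pulls the odhcpd events out of the system log. It assumes the log line layout seen in the excerpt above and an English/C locale for the date names. The event labels and the function name are invented for this example.

```python
import re
from datetime import datetime

# Matches lines like:
#   Sat Jan 14 19:51:26 2017 daemon.warn odhcpd[1047]: <message>
LOG_RE = re.compile(
    r"^(\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d{4}) \S+ odhcpd\[\d+\]: (.*)$"
)

# (substring to look for, label used in the result); labels are made up here.
KINDS = [
    ("no public prefix", "no-public-prefix"),
    ("no addresses available", "no-address"),
    ("no prefix available", "no-pd"),
]


def odhcpd_events(lines):
    """Return a list of (datetime, label) for recognised odhcpd warnings."""
    events = []
    for line in lines:
        m = LOG_RE.match(line.strip())
        if not m:
            continue  # other daemons or unrelated lines
        when = datetime.strptime(m.group(1), "%a %b %d %H:%M:%S %Y")
        for needle, label in KINDS:
            if needle in m.group(2):
                events.append((when, label))
                break
    return events


if __name__ == "__main__":
    import sys
    for when, label in odhcpd_events(sys.stdin):
        print(when.isoformat(), label)
```

Feeding it the saved log on stdin gives one timestamped line per warning, which can then be compared with the ping gaps.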


````
64 bytes from prod.lwn.net: icmp_seq=888 ttl=50 time=78.5 ms
From prod.lwn.net icmp_seq=889 Destination unreachable: Unknown code 5
From prod.lwn.net icmp_seq=890 Destination unreachable: Unknown code 5
From prod.lwn.net icmp_seq=891 Destination unreachable: Unknown code 5
From prod.lwn.net icmp_seq=892 Destination unreachable: Unknown code 5
...
From prod.lwn.net icmp_seq=920 Destination unreachable: Unknown code 5
64 bytes from prod.lwn.net: icmp_seq=921 ttl=50 time=81.1 ms
64 bytes from prod.lwn.net: icmp_seq=922 ttl=50 time=82.7 ms
````

This repeats about every 1-2 minutes. Believe me, having IPv6 working only 50-75% of the time is maddening!
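
To put numbers on the outages, something like this sketch can turn saved ping output into runs of failed sequence numbers. It assumes the output format shown in the excerpt above and treats any "Destination unreachable" line as a failed probe. The function name is made up for this example. Lines without an icmp_seq (such as an elided "...") are skipped, so a gap stays part of the same outage until a successful reply appears.

```python
import re

SEQ_RE = re.compile(r"icmp_seq=(\d+)")


def outages(ping_lines):
    """Return (first_seq, last_seq) for each run of consecutive failures."""
    runs = []
    start = last = None
    for line in ping_lines:
        m = SEQ_RE.search(line)
        if not m:
            continue  # banner lines, "...", summaries
        seq = int(m.group(1))
        if "Destination unreachable" in line:
            if start is None:
                start = seq
            last = seq
        elif start is not None:
            # A successful reply closes the current outage.
            runs.append((start, last))
            start = None
    if start is not None:
        runs.append((start, last))
    return runs


if __name__ == "__main__":
    import sys
    for first, last in outages(sys.stdin):
        print("outage: icmp_seq %d-%d (%d probes)" % (first, last, last - first + 1))
```

For the excerpt above this reports a single outage from icmp_seq 889 to 920, which at ping's default one probe per second is 32 seconds.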

http://www.taht.net/~d/dhcpv6bug/ipv6advert.png - the normal advert

http://www.taht.net/~d/dhcpv6bug/ipv6retract.png - then a short one with lifetime 0

There's a packet capture in the same dir.

...

While debugging and simplifying this today I also ruled out the multicast-to-unicast code as a proximate cause.

        option igmp_snooping '0'      
        option multicast_to_unicast '0'

...

I've seen a few other bug reports like this around, so perhaps I've made some progress. (I was originally triggering this chaos with DHCPv6-PD requests from the EdgeRouter; now it's lede-head throughout, and pure Ethernet rather than a wifi bridge.) Prefixes have occasionally been available, though not at the moment, and the effect is the same with or without a prefix being offered.

So my guess is that odhcpd is (sometimes) not successfully polling for the addresses on the interface. It could be subtler.

More information can be found at the following URL:
https://bugs.lede-project.org/index.php?do=details&task_id=388
