[LEDE-DEV] dnsmasq sometimes fails to start on 17.01.0
Yousong Zhou
yszhou4tech at gmail.com
Wed Mar 29 08:57:13 PDT 2017
On 24 March 2017 at 03:17, Gui Iribarren via Lede-dev
<lede-dev at lists.infradead.org> wrote:
> ---------- Forwarded message ----------
> From: Gui Iribarren <gui at altermundi.net>
> To: Yousong Zhou <yszhou4tech at gmail.com>
> Cc: lede-dev at lists.infradead.org
> Bcc:
> Date: Thu, 23 Mar 2017 16:10:06 -0300
> Subject: Re: [LEDE-DEV] dnsmasq sometimes fails to start on 17.01.0
> Thanks both for the replies!
>
> I continued debugging this yesterday. I played with this
> particular number in /e/i/dnsmasq
> (line 815 in http://pastebin.com/FV09f2jG)
>
> procd_add_raw_trigger "interface.*" 2000 /etc/init.d/dnsmasq reload
>
> this number, as the following link suggests
> http://wiki.prplfoundation.org/wiki/Procd_reference#procd_add_raw_trigger.28event.2C_timeout.2C_.5Bscript.5D.29
> is the number of milliseconds that procd will wait after the trigger (in
> this case, anything related to an "interface" AFAIU) before executing
> "dnsmasq reload"
>
> I put 60000 (60 seconds) and now the reload only happens once (about 60
> seconds after the first "/e/i/dnsmasq boot") and dnsmasq starts without
> problems
>
> so it looks to me like a race condition, where two "interface.*" events
> happen one after the other, triggering two consecutive reloads.
> The first reload doesn't finish its work before the second reload comes,
> and the second reload kills the first one and then, for some reason,
> dies as well.
>
> setting a long raw_trigger timeout works around the problem because the
> "interface.*" events happen all inside the 60 second window, and procd
> runs "/e/i/dnsmasq reload" only once
>
I cannot reproduce the issue yet, but here are some findings from
reading the procd code.

The 2000 ms and 60000 ms values passed to procd_add_raw_trigger above
are a delay before the specified action is run after the event occurs.
It is NOT a timeout to wait for the action to complete before killing it.

When the event first occurs, procd schedules the action to run after
the delay. If the same kind of event happens again
- during that delay (before the action has started), the delay is
  reset to its initial value (procd waits that amount of time again).
- after the delay, while the action is still being carried out, the
  action is marked to re-run after the current run completes.

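To make the two cases above concrete, here is a small shell sketch of
my reading of that scheduling. It is not procd code: the names simulate
and advance and the 1000 ms run time are made up for illustration, and
it assumes a re-run starts as soon as the current run completes.

```sh
#!/bin/sh
# Hypothetical model of the trigger scheduling described above.
# All times are in milliseconds; nothing here talks to procd.

# Advance the simulated state up to time $1, printing "start <ms>"
# each time the action would be started.
advance() {
    now=$1
    while :; do
        if [ "$state" = pending ] && [ "$deadline" -le "$now" ]; then
            # the delay has expired: the action starts
            echo "start $deadline"
            state=running
            finish=$(($deadline + $run))
        elif [ "$state" = running ] && [ "$finish" -le "$now" ]; then
            if [ "$rerun" -eq 1 ]; then
                # assumption: the re-run begins right when the current run ends
                rerun=0
                echo "start $finish"
                finish=$(($finish + $run))
            else
                state=idle
            fi
        else
            break
        fi
    done
}

# usage: simulate DELAY_MS RUN_MS EVENT_MS...   (event times ascending)
simulate() {
    delay=$1; run=$2; shift 2
    state=idle; deadline=0; finish=0; rerun=0
    for t in "$@"; do
        advance "$t"
        case $state in
            idle)    state=pending; deadline=$(($t + $delay)) ;;
            pending) deadline=$(($t + $delay)) ;;  # delay is reset
            running) rerun=1 ;;                    # re-run after this one
        esac
    done
    advance 2147483647   # let everything that is still scheduled happen
}

simulate 2000 1000 0 2500    # prints "start 2000" then "start 3000"
simulate 60000 1000 0 2500   # prints "start 62500"
```

With a 2000 ms delay the second event arrives while the action is
already running, so the action runs twice back to back. With 60000 ms
both events fall inside the delay and the action runs only once.
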
I am wondering whether FS#660 has the same cause:
https://bugs.lede-project.org/index.php?do=details&task_id=660

Regards,
yousong