[LEDE-DEV] [PATCH] ubus cli: wait_for: fix race causing false timeouts
Alexandru Ardelean
ardeleanalex at gmail.com
Fri Oct 7 05:15:53 PDT 2016
On Fri, Oct 7, 2016 at 3:09 PM, Felix Fietkau <nbd at nbd.name> wrote:
> On 2016-10-07 13:57, Zefir Kurtisi wrote:
>> In ubus_cli_wait_for() there is a critical section between
>> initially checking for the requested services and the
>> following handling of 'ubus.object.add' events.
>>
>> In our system we let procd (re)start services and synchronize
>> inter-service dependencies by using 'ubus wait_for' in the
>> initscripts' service_started() functions. There we observe
>> that 'wait_for' randomly waits for the full timeout
>> and returns UBUS_STATUS_TIMEOUT, even if the service it
>> is waiting for is already up and running.
>>
>> This happens when the service is started in the critical
>> section mentioned above. This commit adds a periodic lookup
>> for the requested services while waiting for the 'add' event,
>> which fixes the observed failure.
>>
>> Signed-off-by: Zefir Kurtisi <zefir.kurtisi at neratec.com>
> Instead of introducing yet another timer, wouldn't it also be possible
> to close this race window by registering the event handler before
> attempting the lookup?
>
> - Felix
I've also seen this race.
I tried something like this:
https://github.com/commodo/ubus/commit/8c3986caaa7cd2c12f2b8907ceea54c5bdce3bd2
but I never got around to testing it enough to see whether the race
goes away completely, so I never pushed it upstream.
@Zefir, maybe you could try it?
Thanks
Alex
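
For illustration, here is a minimal sketch of why the ordering Felix
suggests closes the window. It is a self-contained simulation, not
libubus code and not the contents of the commit linked above: mock_bus,
bus_add(), bus_subscribe(), bus_lookup() and simulate_wait() are all
hypothetical names, and the "bus" is just an in-memory list that
delivers an add event only to a waiter that is already subscribed.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_OBJECTS 8

/* Hypothetical stand-in for the bus daemon: a list of registered
 * objects plus at most one waiter listening for "object added". */
struct mock_bus {
	const char *objects[MAX_OBJECTS];
	int n_objects;
	bool subscribed;	/* is the waiter listening for add events? */
	const char *wanted;	/* object the waiter is waiting for */
	bool found;		/* set once the waiter has seen the object */
};

/* A service registers itself. The add event only reaches a waiter
 * that subscribed before this call. */
static void bus_add(struct mock_bus *bus, const char *name)
{
	assert(bus->n_objects < MAX_OBJECTS);
	bus->objects[bus->n_objects++] = name;
	if (bus->subscribed && strcmp(name, bus->wanted) == 0)
		bus->found = true;
}

/* The waiter starts listening for add events for one object. */
static void bus_subscribe(struct mock_bus *bus, const char *wanted)
{
	bus->wanted = wanted;
	bus->subscribed = true;
}

/* One-shot check whether the object is already registered. */
static void bus_lookup(struct mock_bus *bus, const char *wanted)
{
	for (int i = 0; i < bus->n_objects; i++)
		if (strcmp(bus->objects[i], wanted) == 0)
			bus->found = true;
}

/*
 * Run the waiter's two steps in the chosen order and let the service
 * start at one of three points:
 *   start_at == 0: before both steps
 *   start_at == 1: between the two steps
 *   start_at == 2: after both steps
 * Returns true if the waiter noticed the service, false if it would
 * sit until the timeout.
 */
static bool simulate_wait(bool subscribe_first, int start_at,
			  const char *service)
{
	struct mock_bus bus = {0};

	for (int step = 0; step <= 2; step++) {
		if (step == start_at)
			bus_add(&bus, service);

		if (step == 0) {
			if (subscribe_first)
				bus_subscribe(&bus, service);
			else
				bus_lookup(&bus, service);
		} else if (step == 1) {
			if (subscribe_first)
				bus_lookup(&bus, service);
			else
				bus_subscribe(&bus, service);
		}
	}
	return bus.found;
}
```

In this model, simulate_wait(false, 1, ...) is the case from the
report: the lookup runs, the service registers, and only then does the
waiter subscribe, so neither the lookup nor the event sees it. With
subscribe_first set, every start point is covered by either the event
or the lookup. Whether the real ubus_cli_wait_for() behaves the same
way after reordering would still need testing on a live system.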