[LEDE-DEV] libubox, procd: init process hangs
Mats Karrman
mats.dev.list at gmail.com
Tue May 17 03:03:33 PDT 2016
Hi Felix, others,
I have been seeing the dispatch of init scripts suddenly stop and hang
indefinitely.
This happens roughly once in 100 reboots.
After inserting a new start script that launches another daemon
(cgrulesengd) very early in the boot process, the failures became much
more frequent, roughly once in 10 reboots, which makes this a real
issue.
I normally use the versions of procd and libubox selected by the OpenWrt
BB branch, but I have also tested the latest versions from the git repos
with the same result.
So far I have only seen this happen on a fairly fast board (dual-core ARM
Cortex-A9 @ 1 GHz).
Inserting trace prints in libubox changes the behavior, which also
suggests that the problem is timing dependent.
When init hangs:
- it is still possible to log in on console
- there is always a zombie start script, e.g. S11sysctl.
- killing a process (e.g. ubusd or cgrulesengd) makes the init process
continue.
- generating some other event, e.g. inserting something into a USB port,
also makes init continue.
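
For anyone trying to reproduce this, the zombie start script in the list
above can be confirmed from the console by reading /proc/<pid>/stat, where
the state field is "Z" for a zombie. The following is only a minimal sketch
of parsing that field; the function names are hypothetical and are not part
of procd or libubox.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the process state character from one /proc/<pid>/stat line,
 * or '\0' if the line does not look like a stat line.
 * Format: "pid (comm) state ppid ...". The command name may itself
 * contain spaces or ')', so search for the LAST closing parenthesis.
 */
char stat_line_state(const char *line)
{
	const char *close_paren;

	assert(line != NULL);
	close_paren = strrchr(line, ')');
	if (!close_paren || close_paren[1] != ' ' || close_paren[2] == '\0')
		return '\0';
	return close_paren[2];
}

/* Non-zero if the stat line describes a zombie (state 'Z'). */
int stat_line_is_zombie(const char *line)
{
	return stat_line_state(line) == 'Z';
}
```

A caller would read the single line of /proc/<pid>/stat for each pid of
interest and pass it to stat_line_is_zombie().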
I have traced the problem down to the "epoll_wait" call in
libubox::uloop.c::uloop_fetch_events().
The following patch makes sure epoll_wait is never called without a timeout.
My tests show that this solves the problem.
With the patch applied, I have observed the boot getting stuck and then
continuing once the 8 s timeout expires.
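
The effect of the patch can be shown in isolation. The sketch below is not
libubox code: bounded_timeout(), wait_bounded() and demo_empty_set() are
hypothetical helpers, and the cap is passed as a parameter instead of the
fixed 8000 ms used in the patch. It assumes Linux (epoll) and only shows
that a negative, i.e. infinite, timeout is replaced by a finite one, so
that epoll_wait returns 0 when no event arrives.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Map a negative ("wait forever") timeout to a finite cap in ms. */
int bounded_timeout(int timeout_ms, int cap_ms)
{
	assert(cap_ms >= 0);
	return timeout_ms < 0 ? cap_ms : timeout_ms;
}

/*
 * Call epoll_wait without ever blocking indefinitely.
 * Returns the number of ready fds, 0 on timeout, -1 on error.
 */
int wait_bounded(int epfd, struct epoll_event *evs, int max_events,
		 int timeout_ms, int cap_ms)
{
	return epoll_wait(epfd, evs, max_events,
			  bounded_timeout(timeout_ms, cap_ms));
}

/*
 * Demo: an epoll set with no fds registered never produces an event.
 * Requesting an infinite wait (-1) still returns after cap_ms.
 */
int demo_empty_set(int cap_ms)
{
	struct epoll_event evs[4];
	int epfd = epoll_create1(0);
	int n;

	if (epfd < 0)
		return -1;
	n = wait_bounded(epfd, evs, 4, -1, cap_ms);
	close(epfd);
	return n;
}
```

Note that, like the patch, this only bounds the wait. It does not explain
why the expected event is missing.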
However, I'm not sure that this is the correct fix, since there may be
another reason why no event arrives in the first place.
Your feedback would be welcome!
BR // Mats
Currently working for Inteno Broadband Technology AB
----
diff --git a/uloop.c b/uloop.c
index ea160a0..8343bc5 100644
--- a/uloop.c
+++ b/uloop.c
@@ -256,7 +256,7 @@ static int uloop_fetch_events(int timeout)
 {
 	int n, nfds;
 
-	nfds = epoll_wait(poll_fd, events, ARRAY_SIZE(events), timeout);
+	nfds = epoll_wait(poll_fd, events, ARRAY_SIZE(events), timeout < 0 ? 8000 : timeout);
 	for (n = 0; n < nfds; ++n) {
 		struct uloop_fd_event *cur = &cur_fds[n];
 		struct uloop_fd *u = events[n].data.ptr;