[LEDE-DEV] libubox, procd: init process hangs

Mats Karrman mats.dev.list at gmail.com
Tue May 17 03:03:33 PDT 2016


Hi Felix, others,

I have been experiencing a problem where the dispatch of init scripts 
suddenly stops and never resumes on its own.
This happens roughly once in 100 reboots.
After inserting a new start script that launches another daemon 
(cgrulesengd) very early in the boot process, the failures started to 
come a lot more frequently, maybe once in 10 reboots, making this a real 
issue.
I'm normally using the versions of procd and libubox selected by the 
OpenWrt BB branch, but I have tested the latest versions from the git 
repos with the same result.
So far I have only seen this happen on a fairly fast board (dual-core 
ARM Cortex-A9 @ 1 GHz).
Inserting trace prints in libubox changes the behavior, which also 
suggests that the problem is timing-dependent.

When init hangs:
- it is still possible to log in on console
- there is always a zombie start script, e.g. S11sysctl.
- killing a process (e.g. ubusd or cgrulesengd) makes the init process 
continue.
- generating some other event, e.g. inserting something into a USB port, 
also makes the init process continue.

I have traced the problem down to the "epoll_wait" call in 
libubox::uloop.c::uloop_fetch_events().
The following patch makes sure epoll_wait is never called without a timeout.
My tests show that this solves the problem.
With the patch applied, I have observed the boot getting stuck and then 
continuing after the 8 s timeout.
However, I'm not sure that this is the correct fix for the problem, as 
there may be other reasons why there is no event in the first place.
Your feedback would be welcome!
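
For illustration only, here is a minimal standalone sketch of the idea 
behind the patch: a negative ("wait forever") timeout is replaced by an 
upper bound before it reaches epoll_wait(), so the call always returns 
eventually. The helper names bounded_timeout() and wait_on_idle_pipe() 
are hypothetical and not part of libubox; the idle pipe only stands in 
for a file descriptor on which no event ever arrives.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper (not libubox API): turn a negative, i.e. infinite,
 * timeout into a bounded one; leave all other values untouched. */
static int bounded_timeout(int timeout_ms, int max_wait_ms)
{
	assert(max_wait_ms >= 0);
	return timeout_ms < 0 ? max_wait_ms : timeout_ms;
}

/* Wait on an epoll instance that watches a pipe nobody writes to.
 * Returns the epoll_wait() result: 0 on timeout, -1 on error. */
static int wait_on_idle_pipe(int timeout_ms, int max_wait_ms)
{
	struct epoll_event ev = { .events = EPOLLIN }, out;
	int pipefd[2], epfd, n = -1;

	if (pipe(pipefd) < 0)
		return -1;

	epfd = epoll_create1(0);
	if (epfd >= 0) {
		ev.data.fd = pipefd[0];
		if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) == 0)
			n = epoll_wait(epfd, &out, 1,
				       bounded_timeout(timeout_ms, max_wait_ms));
		close(epfd);
	}

	close(pipefd[0]);
	close(pipefd[1]);
	return n;
}
```

Calling wait_on_idle_pipe(-1, 8000) would block for about 8 s and then 
return 0, whereas an unbounded epoll_wait() on the same descriptor would 
never return. This bounds the stall but does not explain why the event 
is missing.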

BR // Mats
Currently working for Inteno Broadband Technology AB

----

diff --git a/uloop.c b/uloop.c
index ea160a0..8343bc5 100644
--- a/uloop.c
+++ b/uloop.c
@@ -256,7 +256,7 @@ static int uloop_fetch_events(int timeout)
  {
      int n, nfds;

-    nfds = epoll_wait(poll_fd, events, ARRAY_SIZE(events), timeout);
+    nfds = epoll_wait(poll_fd, events, ARRAY_SIZE(events), timeout < 0 ? 8000 : timeout);
      for (n = 0; n < nfds; ++n) {
          struct uloop_fd_event *cur = &cur_fds[n];
          struct uloop_fd *u = events[n].data.ptr;



