[LEDE-DEV] [PATCH][ubox] Fixes log read starvation issue after threshold reached

Petr Štetiar ynezz at true.cz
Fri Jul 14 00:57:58 PDT 2017


Jo-Philipp Wich <jo at mein.io> [2017-07-12 23:34:13]:

Hi,

> Do you have an easy test case for this? Is piping ~16K plaintext to the
> logger enough to trigger the starvation? I'd like to understand the
> problem some more before judging the patch.

I was trying to fix a probably similar issue back in March, but the project's
deadline was approaching rather quickly, so I moved it to my TODO list
and started using syslog from Busybox...

Here's just the commit message with a test case, as I wasn't able to fix it properly yet:

	Attempt to fix logread's stuck remote syslog or file logging on logd's restart

	I'm experiencing some strange behaviour of logread's file logging. From
	time to time I see on some machines that the log file is not growing
	and I'm losing all the system log messages.

	Steps to reproduce:

	1. /etc/init.d/log stop
	2. logd -S 100 &
	3. logread -f -F /tmp/messages -p /var/run/logread.1.pid -S 100 &
	4. pkill -9 logd
	5. logd -S 100 &

	Without this patch, logread will never log any new messages to the
	/tmp/messages file, since the ubus async request is still using the old
	ubus log `id` from step [2], but logd now has a new ubus log `id` from
	step [5].
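
To illustrate the stale `id` problem described above, here is a toy model in plain C. It does not use libubus: `toy_bus`, `toy_register`, `toy_lookup`, `toy_request` and `client_request` are all hypothetical names made up for this sketch. It also assumes a failed request is visible to the client, which is a simplification, since the real logread request is asynchronous and just goes quiet.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Toy stand-in for the bus: one named object whose id changes on every
 * (re)registration, as happens when logd is killed and started again.
 */
struct toy_bus {
	const char *name;	/* registered object name, NULL if none */
	uint32_t id;		/* id of the current registration */
	uint32_t next_id;	/* id handed out on the next registration */
};

static void toy_register(struct toy_bus *b, const char *name)
{
	assert(name != NULL);
	b->name = name;
	b->id = b->next_id++;
}

/* Resolve a name to the current id; returns 0 on success, -1 otherwise. */
static int toy_lookup(const struct toy_bus *b, const char *name, uint32_t *id)
{
	if (!b->name || strcmp(b->name, name) != 0)
		return -1;
	*id = b->id;
	return 0;
}

/* A request only succeeds when addressed to the current id. */
static int toy_request(const struct toy_bus *b, uint32_t id)
{
	return (b->name && id == b->id) ? 0 : -1;
}

/* Client side: on failure, look the name up again and retry once. */
static int client_request(const struct toy_bus *b, uint32_t *cached_id)
{
	if (toy_request(b, *cached_id) == 0)
		return 0;
	if (toy_lookup(b, "log", cached_id) < 0)
		return -1;
	return toy_request(b, *cached_id);
}
```

A client that keeps using the id it looked up before the restart fails forever in this model, while `client_request()` recovers after a single fresh lookup.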

And here is one more fix which I wasn't able to test and submit yet:

	commit ecbc9e12dc298b76e2597a9bdaff6db092e3e402
	Author: Petr Štetiar <ynezz at true.cz>
	Date:   Sun Mar 19 22:15:56 2017 +0100

	    Check for log file write error and try to recover
	    
	    Signed-off-by: Petr Štetiar <ynezz at true.cz>

	diff --git a/log/logread.c b/log/logread.c
	index 1719976..589ea5f 100644
	--- a/log/logread.c
	+++ b/log/logread.c
	@@ -191,6 +191,14 @@ static int log_notify(struct blob_attr *msg)
				getcodetext(LOG_PRI(p), prioritynames),
				(blobmsg_get_u32(tb[LOG_SOURCE])) ? ("") : (" kernel:"), m);
			ret = write(sender.fd, buf, strlen(buf));
	+               if (ret < 0) {
	+                       fprintf(stderr, "log write failed: %s\n", strerror(errno));
	+                       close(sender.fd);
	+                       sender.fd = -1;
	+                       free(str);
	+                       uloop_end();
	+                       return ret;
	+               }
		}
	 
		free(str);
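
The same "check the write and try to recover" idea can be sketched outside logread with plain POSIX calls. This is only an illustration under assumptions: `struct log_sink`, `sink_open` and `sink_write` are hypothetical names, not part of ubox. Unlike the patch above, which closes the descriptor and ends the uloop, this sketch reopens the file and retries once.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for logread's file sender state. */
struct log_sink {
	const char *path;	/* log file to append to */
	int fd;			/* current descriptor, -1 when closed */
};

/* (Re)open the log file in append mode; returns 0 on success, -1 on error. */
static int sink_open(struct log_sink *s)
{
	assert(s && s->path);
	s->fd = open(s->path, O_CREAT | O_WRONLY | O_APPEND, 0600);
	return s->fd < 0 ? -1 : 0;
}

/*
 * Write one message. On failure, report it, drop the descriptor and
 * retry once on a freshly opened one. Returns the result of the last
 * write() attempt, or -1 if the file could not be opened at all.
 */
static ssize_t sink_write(struct log_sink *s, const char *buf)
{
	size_t len = strlen(buf);
	ssize_t ret = -1;
	int attempt;

	for (attempt = 0; attempt < 2; attempt++) {
		if (s->fd < 0 && sink_open(s) < 0)
			break;
		ret = write(s->fd, buf, len);
		if (ret >= 0)
			break;
		fprintf(stderr, "log write failed: %s\n", strerror(errno));
		close(s->fd);
		s->fd = -1;
	}
	return ret;
}
```

A caller would fill in `path`, set `fd` to -1 and call `sink_write()` for each formatted message. Short writes are not handled here, just as in the original code.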

> > process seems to halt silently and yet continues running.  A restart
> > of the log services fixes it.

Yep.

-- ynezz


