Issue in dmesg time with lockless ring buffer

J. Avila elavila at google.com
Thu Jan 28 20:46:03 EST 2021


Hello John,

I’ve done some additional digging on my end. I tested using a 5.10.11
kernel and observed the following:

1) With the default of CONFIG_LOG_BUF_SHIFT=17, I was not able to reproduce
   the issue.
2) With CONFIG_LOG_BUF_SHIFT=20, I was able to reproduce the behavior
   mentioned before.
3) With (2) + reverting up to and including 896fbe20b4e2 (printk: use the
   lockless ringbuffer), I saw short dmesg times again.
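
For anyone who wants to take the measurement without going through dmesg
itself, here is a rough sketch in Python of timing a full non-blocking read
of /dev/kmsg. It is not what dmesg does internally, only an approximation
of it. The helper name `time_drain` and the 8192-byte read size are my own
choices for illustration:

```python
import os
import time


def time_drain(path="/dev/kmsg", bufsize=8192):
    """Read every record currently available; return (records, seconds)."""
    # O_NONBLOCK makes read() fail with EAGAIN once the buffer is drained
    # instead of waiting for new messages.
    fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    records = 0
    start = time.monotonic()
    try:
        while True:
            try:
                chunk = os.read(fd, bufsize)  # /dev/kmsg returns one record per read
            except BlockingIOError:
                break  # EAGAIN: nothing more to read right now
            except BrokenPipeError:
                continue  # EPIPE: records were overwritten; the next read resumes
            if not chunk:
                break  # end of file, only seen on a regular file
            records += 1
    finally:
        os.close(fd)
    return records, time.monotonic() - start
```

Calling `time_drain()` repeatedly (it usually needs root) while another
shell keeps writing to /dev/kmsg should show whether the elapsed time grows
as the buffer fills.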

It seems that this issue may only occur with a sufficiently large log
buffer. Although 1MB is a relatively uncommon size for Linux kernel log
buffers, this still indicates a potential issue in the code; do you think
it's worth investigating?
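
For reference, the buffer sizes behind (1) and (2) follow directly from the
shift value, since the static log buffer is 1 << CONFIG_LOG_BUF_SHIFT bytes.
A minimal sketch of that arithmetic in Python, where `log_buf_bytes` is only
an illustrative helper name:

```python
def log_buf_bytes(shift):
    """Size in bytes of a log buffer configured with CONFIG_LOG_BUF_SHIFT=shift."""
    return 1 << shift


for shift in (17, 20):
    size = log_buf_bytes(shift)
    print(f"CONFIG_LOG_BUF_SHIFT={shift}: {size} bytes ({size // 1024} KiB)")
# CONFIG_LOG_BUF_SHIFT=17: 131072 bytes (128 KiB)
# CONFIG_LOG_BUF_SHIFT=20: 1048576 bytes (1024 KiB)
```

So the failing configuration has a buffer eight times larger than the
default one.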

Thanks,

Avila

On Mon, Jan 25, 2021 at 4:00 PM J. Avila <elavila at google.com> wrote:
>
> Hello,
>
> This dmesg uses /dev/kmsg; we've verified that we don't see this long
> dmesg time when reading from syslog (via dmesg -S).
>
> We've also tried testing this with logging daemons disabled as well as
> within initrd - both result in similar behavior.
>
> If it's relevant, this was done on a toybox shell.
>
> Thanks,
>
> Avila
>
> On Mon, Jan 25, 2021 at 5:32 AM John Ogness <john.ogness at linutronix.de> wrote:
> >
> > On 2021-01-22, "J. Avila" <elavila at google.com> wrote:
> > > When doing some internal testing on a 5.10.4 kernel, we found that the
> > > time taken for dmesg seemed to increase from the order of milliseconds
> > > to the order of seconds when the dmesg size approached the ~1.2MB
> > > limit. After doing some digging, we found that by reverting all of the
> > > patches in printk/ up to and including
> > > 896fbe20b4e2333fb55cc9b9b783ebcc49eee7c7 ("use the lockless
> > > ringbuffer"), we were able to once more see normal dmesg times.
> > >
> > > This kernel had no meaningful diffs in the printk/ dir when compared
> > > to Linus' tree. This behavior was consistently reproducible using the
> > > following steps:
> > >
> > > 1) In one shell, run "time dmesg > /dev/null"
> > > 2) In another, constantly write to /dev/kmsg
> > >
> > > Within ~5 minutes, we saw that dmesg times increased to 1 second, only
> > > increasing further from there. Is this a known issue?
> >
> > The last couple days I have tried to reproduce this issue with no
> > success.
> >
> > Is your dmesg using /dev/kmsg or syslog() to read the buffer?
> >
> > Are there any syslog daemons or systemd running? Perhaps you can run
> > your test within an initrd to see if this effect is still visible?
> >
> > John Ogness


