POC: Alternative solution: Re: [PATCH 0/4] printk: reimplement LOG_CONT handling

Sergey Senozhatsky sergey.senozhatsky at gmail.com
Thu Aug 13 23:34:24 EDT 2020


On (20/08/13 12:35), John Ogness wrote:
> I believe I failed to recognize the fundamental problem. The fundamental
> problem is that the pr_cont() semantics are very poor.

The semantics are pretty clear: use it only in UP early bootup;
anything else is broken :)

  /*
   * Annotation for a "continued" line of log printout (only done after a
   * line that had no enclosing \n). Only to be used by core/arch code
   * during early bootup (a continued line is not SMP-safe otherwise).
   */
  #define KERN_CONT	KERN_SOH "c"

> I now strongly believe that we need to fix those semantics by having the
> pr_cont() user take responsibility for buffering the message. Patching the
> ~2000 pr_cont() users will be far easier than continuing to twist ourselves
> around this madness.

I welcome this effort. We've been talking about the fact that pr_cont() is
not something we can ignore anymore (we have more and more SMP users of
it) since the Kernel Summit in Santa Fe, NM, but the general response back
then was "oh my god, who cares" (pretty sure this is very close to what Ted
Ts'o said during the printk session).

> Here is an example for a new pr_cont() API:
> 
>     struct pr_cont c;
> 
>     pr_cont_alloc_info(&c);
>        (or alternatively)
>     dev_cont_alloc_info(dev, &c);
> 
>     pr_cont(&c, "1");
>     pr_cont(&c, "2");
> 
>     pr_cont_flush(&c);
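
To make the caller-buffered model concrete, here is a rough userspace
sketch of it. It is not the proposed kernel API: struct cont_buf,
cont_init(), cont_add(), cont_flush() and CONT_BUF_SIZE are hypothetical
names chosen for illustration, and stdio stands in for printk. The point
is only that fragments accumulate in caller-owned storage and reach the
log as one complete line.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define CONT_BUF_SIZE 256	/* hypothetical size, not from the patch set */

/* Hypothetical caller-owned continuation buffer. */
struct cont_buf {
	char text[CONT_BUF_SIZE];
	size_t len;		/* bytes used, excluding the trailing NUL */
};

/* Start with an empty line. */
static void cont_init(struct cont_buf *c)
{
	assert(c);
	c->len = 0;
	c->text[0] = '\0';
}

/* Append one formatted fragment; overlong input is truncated. */
static void cont_add(struct cont_buf *c, const char *fmt, ...)
{
	size_t room = sizeof(c->text) - c->len;	/* always >= 1 */
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(c->text + c->len, room, fmt, ap);
	va_end(ap);

	if (n < 0)
		return;				/* formatting error: keep old content */
	if ((size_t)n >= room)
		c->len = sizeof(c->text) - 1;	/* output was truncated */
	else
		c->len += (size_t)n;
}

/*
 * Emit the accumulated fragments as a single line and reset the buffer.
 * Returns the number of buffered bytes that were emitted.
 */
static size_t cont_flush(struct cont_buf *c, FILE *out)
{
	size_t n = c->len;

	if (n)
		fprintf(out, "%s\n", c->text);
	cont_init(c);
	return n;
}
```

Calling cont_init(&c), then cont_add(&c, "%d", 1) and cont_add(&c, "%s", "2"),
then cont_flush(&c, stdout) emits the fragments together as one line, so
another context cannot interleave its output between them.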

This might be a bit more complex.

One thing that we need to handle here, I believe, is that the context
which crashes the kernel should flush its cont buffer, because the
information there is relevant to the crash:

	pr_cont_alloc_info(&c);
	pr_cont(&c, "1");
	pr_cont(&c, "2");
	>>
	   oops
	      panic()
	<<
	pr_cont_flush(&c);

We'd better flush that context's pr_cont buffer during panic().

Another example:


	pr_cont_alloc_info(&c);

	for (i = 0; i < p->sz; i++)
		pr_cont(&c, p->buf[i]);
	>>
	   page fault
	    exit
	<<
	pr_cont_flush(&c);

I believe we need to flush the pr_cont() buffer early in this case as
well, because the information there might be very helpful.
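
One way both cases could be handled is to remember which buffer the
current context has open, so that the panic or fault path can flush it.
Below is a minimal single-threaded userspace sketch of that idea.
Everything in it is an assumption made for illustration: current_cont
stands in for something like a per-task pointer, cont_flush_on_crash()
models a hook that panic() or the fault handler might call, and none of
these names exist in the kernel. Locking and nested buffers are ignored.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CONT_BUF_SIZE 128	/* hypothetical size */

/* Hypothetical caller-owned continuation buffer. */
struct cont_buf {
	char text[CONT_BUF_SIZE];
	size_t len;
};

/* Stand-in for a per-task "open cont buffer" pointer (assumption). */
static struct cont_buf *current_cont;

/* Open a buffer and remember it for the crash path. */
static void cont_begin(struct cont_buf *c)
{
	c->len = 0;
	c->text[0] = '\0';
	current_cont = c;
}

/* Append a fragment, truncating if the buffer is full. */
static void cont_add(struct cont_buf *c, const char *s)
{
	size_t room = sizeof(c->text) - 1 - c->len;
	size_t n = strlen(s);

	if (n > room)
		n = room;
	memcpy(c->text + c->len, s, n);
	c->len += n;
	c->text[c->len] = '\0';
}

/* Normal path: emit the line, reset the buffer, forget it. */
static size_t cont_flush(struct cont_buf *c, FILE *out)
{
	size_t n = c->len;

	if (n)
		fprintf(out, "%s\n", c->text);
	c->len = 0;
	c->text[0] = '\0';
	if (current_cont == c)
		current_cont = NULL;
	return n;
}

/*
 * Crash path: flush whatever the crashing context had buffered so far.
 * Returns the number of bytes flushed, 0 if nothing was pending.
 */
static size_t cont_flush_on_crash(FILE *out)
{
	if (!current_cont)
		return 0;
	return cont_flush(current_cont, out);
}
```

If the context dies between cont_add() and cont_flush(), a call to
cont_flush_on_crash() from the crash path still gets the partial line
out. A buffer that was already flushed normally is not emitted twice.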

	-ss


