[RFC PATCH 2/7] printk: Simple implementation for NMI backtracing
Daniel Thompson
daniel.thompson at linaro.org
Thu Mar 19 11:48:10 PDT 2015
On 19/03/15 18:30, Peter Zijlstra wrote:
> On Thu, Mar 19, 2015 at 01:39:58PM -0400, Steven Rostedt wrote:
>>> +void printk_nmi_backtrace_complete(void)
>>> +{
>>> +	struct nmi_seq_buf *s;
>>> +	int len, cpu, i, last_i;
>>> +
>>> +	/*
>>> +	 * Now that all the NMIs have triggered, we can dump out their
>>> +	 * back traces safely to the console.
>>> +	 */
>>> +	for_each_possible_cpu(cpu) {
>>> +		s = &per_cpu(nmi_print_seq, cpu);
>>> +		last_i = 0;
>>> +
>>> +		len = seq_buf_used(&s->seq);
>>> +		if (!len)
>>> +			continue;
>>> +
>>> +		/* Print line by line. */
>>> +		for (i = 0; i < len; i++) {
>>> +			if (s->buffer[i] == '\n') {
>>> +				print_seq_line(s, last_i, i);
>>> +				last_i = i + 1;
>>> +			}
>>> +		}
>>> +		/* Check if there was a partial line. */
>>> +		if (last_i < len) {
>>> +			print_seq_line(s, last_i, len - 1);
>>> +			pr_cont("\n");
>>> +		}
>>> +
>>> +		/* Wipe out the buffer ready for the next time around. */
>>> +		seq_buf_clear(&s->seq);
>>> +	}
>>> +
>>> +	clear_bit(0, &nmi_print_flag);
>>> +	smp_mb__after_atomic();
>>
>> Is this really necessary? What is the mb synchronizing?
>>
>> [ Added Peter Zijlstra to confirm it's not needed ]
>
> It surely looks suspect; and it lacks a comment, which is a clear sign
> it's buggy.
>
> Now if it tries to order the accesses to the seqbuf against the clearing
> of the bit one would have expected a _before_ barrier, not an _after_.
It's nothing to do with the seqbuf: I added the seqbuf code myself, but
the barrier was already in the code that I copied from.

In the mainline code today it looks like this as part of the x86 code
(note that the call to put_cpu() is also in my patchset, but it lives in
the arch/ specific code rather than the generic code):
: 		/* Check if there was a partial line. */
: 		if (last_i < len) {
: 			print_seq_line(s, last_i, len - 1);
: 			pr_cont("\n");
: 		}
: 	}
:
: 	clear_bit(0, &backtrace_flag);
: 	smp_mb__after_atomic();
: 	put_cpu();
: }
The barrier was not intended to have anything to do with put_cpu()
either, though, since it was added before put_cpu() arrived:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=554ec063982752e9a569ab9189eeffa3d96731b2
There's nothing in the commit message explaining the barrier and I
really can't see what it is for.
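
To make Peter's point about _before_ versus _after_ concrete, here is a
minimal userspace sketch using C11 atomics rather than the kernel
primitives. The names backtrace_claim(), backtrace_finish(), print_flag
and buf are hypothetical stand-ins, not code from the patch or from
mainline. It only illustrates the case Peter describes, where the last
buffer accesses have to be ordered before the flag is cleared; it does
not explain why the existing code has an _after_ barrier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

/* Hypothetical stand-ins for the per-cpu seq buffer and the flag word. */
static char buf[64];
static atomic_ulong print_flag;

/* Try to claim the buffer. Returns 1 on success, 0 if already claimed. */
static int backtrace_claim(void)
{
	/* Acquire pairs with the release in backtrace_finish(). */
	unsigned long old = atomic_fetch_or_explicit(&print_flag, 1UL,
						     memory_order_acquire);
	return !(old & 1UL);
}

/* Wipe the buffer, then drop the claim. */
static void backtrace_finish(void)
{
	assert(atomic_load_explicit(&print_flag, memory_order_relaxed) & 1UL);

	/* Last access to the buffer... */
	memset(buf, 0, sizeof(buf));

	/*
	 * ...which must not be reordered past the flag clear. Release
	 * ordering on the clear is the userspace analogue of putting the
	 * barrier _before_ the atomic operation, not after it.
	 */
	atomic_fetch_and_explicit(&print_flag, ~1UL, memory_order_release);
}
```

With this shape, a second caller that wins backtrace_claim() is
guaranteed to see the wiped buffer. A barrier placed only after the
clear would not give that guarantee.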
Daniel.