[PATCH v4 13/17] watchdog/hardlockup: detect hard lockups using secondary (buddy) CPUs

Mon May 8 08:52:35 PDT 2023

Hi,

On Sun, May 7, 2023 at 6:05 PM Nicholas Piggin <npiggin at gmail.com> wrote:
>
> > No, I wasn't aware of it. Interesting, it seems to basically enable
> > both types of hardlockup detectors together. If that really catches
> > more lockups, it seems like we could do the same thing for the buddy
> > system.
>
> It doesn't catch more lockups. On powerpc we don't have a reliable
> periodic NMI hence the SMP checker. But it is preferable that a CPU
> detects its own lockup because NMI IPIs can result in crashes if
> they are taken in certain critical sections.

Ah, interesting. Is the fact that NMI IPIs can crash the system
something specific to the way they're implemented in powerpc, or is
this something common in Linux? I've been assuming that IPI NMIs would
be roughly as reliable (or perhaps more reliable) than the NMI
timer/perf counter.

> > > It's all to
> > > all rather than buddy which makes it more complicated but arguably
> > > bit better functionality.
> >
> > Can you come up with an example crash where the "all to all" would
> > work better than the simple buddy system provided by this patch?
>
> CPU2                     CPU3
> spin_lock_irqsave(A)     spin_lock_irqsave(B)
> spin_lock_irqsave(B)     spin_lock_irqsave(A)
>
> CPU1 will detect the lockup on CPU2, but CPU3's lockup won't be
> detected so we don't get the trace that can diagnose the bug.

Ah, that makes sense as a benefit if
"sysctl_hardlockup_all_cpu_backtrace" is false, which you'd probably
want if you had massively SMP systems. ...of course, if you had a lot
of cores then I'd imagine the "all-to-all" might become a lot of
overhead...

Hmmm, but I don't think you really need "all-to-all" checking to get
the stacktraces you want, do you? Each CPU can be "watching" exactly
one other CPU, but then when we actually lock up we could check all of
them and dump stacks on all the ones that are locked up. I think this
would be a fairly easy improvement for the buddy system. I'll leave it
out for now just to keep things simple for the initial landing, but it
wouldn't be hard to add. Then I think the two SMP systems  (buddy vs.
all-to-all) would be equivalent in terms of functionality?

Thinking about this more, I guess you must be assuming coordination
between the "SMP" and "non-SMP" checker. Specifically I guess you'd
want some guarantee that the "SMP" checker always fires before the
non-SMP checker because that would have a higher chance of getting a
stack trace without tickling the IPI NMI crash. ...but then once the
non-SMP checker printed its own stack trace you'd like the SMP checker
running on the same CPU to check to see if any other CPUs were locked
up so you could get their stacks as well. With my simplistic solution
of just allowing the buddy detector to be enabled in parallel with a
perf-based detector then we wouldn't have this level of coordination,
but I'll assume that's OK for the initial landing.

> Another thing I actually found it useful for is you can easily
> see if a core (i.e., all threads in the core) or a chip has
> died. Maybe more useful when doing presilicon and bring up work
> or firmware hacking, but still useful.

I'm probably biased, but for bringup work and stuff like this I
usually have kdb/kgdb enabled and then I could get stack traces for
whichever CPUs I want. :-P

-Doug