[V2 PATCH 1/3] x86/panic: Fix re-entrance problem due to panic on NMI
河合英宏 / KAWAI,HIDEHIRO
hidehiro.kawai.ez at hitachi.com
Fri Jul 31 04:23:00 PDT 2015
> From: Michal Hocko [mailto:mhocko at kernel.org]
>
> On Thu 30-07-15 11:55:52, 河合英宏 / KAWAI,HIDEHIRO wrote:
> > > From: Michal Hocko [mailto:mhocko at kernel.org]
> [...]
> > > Could you point me to the code which does that, please? Maybe we are
> > > missing that in our 3.0 kernel. I was quite surprised to see this
> > > behavior as well.
> >
> > Please see the snippet below.
> >
> > void setup_local_APIC(void)
> > {
> >         ...
> >         /*
> >          * only the BP should see the LINT1 NMI signal, obviously.
> >          */
> >         if (!cpu)
> >                 value = APIC_DM_NMI;
> >         else
> >                 value = APIC_DM_NMI | APIC_LVT_MASKED;
> >         if (!lapic_is_integrated())             /* 82489DX */
> >                 value |= APIC_LVT_LEVEL_TRIGGER;
> >         apic_write(APIC_LVT1, value);
> >
> >
> > The LINT1 pins of CPUs other than CPU 0 are masked here.
> > However, at least on some Hitachi servers, an NMI raised by the NMI
> > button doesn't seem to be delivered through LINT1. So my term
> > `external NMI' may not be accurate.
>
> I am not familiar with details here but I can tell you that this
> particular code snippet is the same in our 3.0 based kernel so it seems
> that the HW is indeed doing something differently.

Yes, and it turned out that my PATCH 3/3 doesn't work at all on some
hardware...

> > > You might still get a panic on hardlockup which will happen on all CPUs
> > > from the NMI context so we have to be able to handle panic in NMI on
> > > many CPUs.
> >
> > Do you mean the case of a kernel panic while other CPUs are locked up
> > in NMI context? In that case, there is no way to do the things the
> > kdump procedure needs, including saving registers...
>
> I am saying that watchdog_overflow_callback might trigger on more CPUs
> and panic from NMI context as well. So this is not limited to the case
> where the NMI button sends an NMI to more CPUs.

I understand. So I also have to modify watchdog_overflow_callback()
to call nmi_panic().

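For illustration, the gating that a shared nmi_panic() entry point would
need can be modeled in userspace C roughly as below. This is only a sketch
under assumptions, not the patch itself: panic_cpu, PANIC_CPU_INVALID,
enum panic_action and nmi_panic_decide() are hypothetical names, and C11
atomics stand in for the kernel's atomic operations.

```c
#include <assert.h>
#include <stdatomic.h>

#define PANIC_CPU_INVALID (-1)	/* hypothetical sentinel: nobody has panicked yet */

/* What a CPU entering the NMI panic path should do next (hypothetical). */
enum panic_action {
	PANIC_PROCEED,	/* first CPU in: go on and run panic() */
	PANIC_RETURN,	/* this CPU is already inside panic(): just return */
	PANIC_STOP	/* another CPU owns the panic: park this one */
};

/* ID of the CPU that owns the panic path (hypothetical name). */
atomic_int panic_cpu = PANIC_CPU_INVALID;

enum panic_action nmi_panic_decide(int cpu)
{
	int owner = PANIC_CPU_INVALID;

	assert(cpu >= 0);	/* negative IDs would collide with the sentinel */

	/* Only the first caller replaces the sentinel with its own CPU ID. */
	if (atomic_compare_exchange_strong(&panic_cpu, &owner, cpu))
		return PANIC_PROCEED;

	/* On failure, 'owner' now holds the ID of the CPU that won. */
	return owner == cpu ? PANIC_RETURN : PANIC_STOP;
}
```

With this model, both the unknown-NMI handler and the watchdog callback
could call the same function, and only one CPU would ever get
PANIC_PROCEED no matter how many CPUs receive the NMI.
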
> Why can't the panic() context save all the registers if we are going to
> loop in NMI context? This would IMHO be preferable to returning from
> panic.

I'm not saying we cannot save registers and do some cleanup in NMI
context. I feel that it would introduce unneeded complexity.
Since watchdog_overflow_callback() is defined in generic code,
we would need to implement the kdump preparation for other
architectures as well. I haven't checked which architectures support
both the NMI watchdog and kdump, though.

Anyway, I came up with a simple solution for x86: wait in nmi_panic()
until nmi_shootdown_cpus() has been called, then invoke the
callback registered by nmi_shootdown_cpus().

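A minimal userspace sketch of that idea follows. It is a model under
assumptions, not kernel code: shootdown_cb, register_shootdown_callback(),
wait_and_run_shootdown() and record_cpu_state() are hypothetical names, and
the bounded spin count exists only so that the model terminates (a real
implementation would presumably keep spinning in NMI context).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*shootdown_cb_t)(int cpu);

/* Callback published by the panicking CPU; NULL until it is registered. */
_Atomic(shootdown_cb_t) shootdown_cb = NULL;

/* Stand-in for "save this CPU's registers for the crash dump". */
int last_saved_cpu = -1;

void record_cpu_state(int cpu)
{
	last_saved_cpu = cpu;
}

/* Models the registration step done on the panicking CPU. */
void register_shootdown_callback(shootdown_cb_t cb)
{
	atomic_store(&shootdown_cb, cb);
}

/*
 * Models a CPU that lost the panic race: spin until the callback
 * shows up, then run it. Returns false if it never appeared within
 * max_spins polls.
 */
bool wait_and_run_shootdown(int cpu, unsigned long max_spins)
{
	for (unsigned long i = 0; i < max_spins; i++) {
		shootdown_cb_t cb = atomic_load(&shootdown_cb);

		if (cb) {
			cb(cpu);
			return true;
		}
	}
	return false;
}
```

The point of the design is that a CPU already stuck in NMI context does
not need a second NMI to be delivered; it picks up the callback by polling.
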
> > > I can provide the full log but it is quite mangled. I guess the
> > > CPU130 was the only one allowed to proceed with the panic while others
> > > returned from the unknown NMI handling. It took a lot of time until
> > > CPU130 managed to boot the crash kernel with soft lockups and RCU stalls
> > > reports. CPU0 is most probably locked up waiting for CPU130 to
> > > acknowledge the IPI which will not happen apparently.
> >
> > There is a timeout of 1000ms in nmi_shootdown_cpus(), so I don't know
> > why CPU 130 waits so long. I'll think about it for a while.
>
> Yes, I do not understand the timing here either and the fact that the
> log is a complete mess in the important parts doesn't help a wee bit.

I'm interested in where the "Kernel panic - not syncing: " message
appears in the log. It may give us a clue.

Regards,
Kawai