[PATCH v3] x86/kdump: Handle blocked NMIs interrupt to avoid kdump crashes

Zeng Heng zengheng4 at huawei.com
Tue Feb 14 19:05:11 PST 2023


On 2023/2/15 9:01, Baoquan He wrote:
> Add kexec list to CC.
>
> On 02/14/23 at 10:49am, Peter Zijlstra wrote:
>> On Tue, Feb 14, 2023 at 05:30:46PM +0800, Zeng Heng wrote:
>>
>>>> I never remember the shutdown paths -- do we force wipe the PMU
>>>> registers somewhere before this?
>>> I have checked the panic process, and there is no wipe operation for PMU
>>> registers, which causes the watchdog bites.
>>>
>>> Do you mean we should directly disable PMU registers instead of calling
>>> `iret_to_self` to consume blocked NMI interrupts?
>> If you don't wipe the PMU, there will be many and continuous NMIs, a
>> single IRET-to-SELF isn't going to save you.
>>
>> Anyway, I had a bit of a grep around and I find we have:
>>
>>    kernel/events/core.c:   register_reboot_notifier(&perf_reboot_notifier);
>>
>> which should end up killing all the PMU activity. Somewhere around there
>> there's also a CONFIG_KEXEC_CORE ifdef, so I'm thinking it gets called
>> on the panic->crash-kernel path too?
> No, reboot_notifier_list is only handled in the kexec reboot/reboot path,
> please see the kernel_restart_prepare() invocation. The kdump path only shuts
> down key components such as the CPUs and the interrupt controller.

I will replace iret_to_self() with perf_event_exit_cpu() in the kdump
shutdown path (in native_machine_crash_shutdown()).

After testing, I will send v4.

Thanks all,

Zeng Heng




