[PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made

Borislav Petkov bp at alien8.de
Mon Feb 20 03:09:41 PST 2017

On Mon, Feb 20, 2017 at 02:10:37PM +0800, Xunlei Pang wrote:
> @@ -1128,8 +1129,9 @@ void do_machine_check(struct pt_regs *regs, long error_code)
>  	 */
>  	int lmce = 1;
> -	/* If this CPU is offline, just bail out. */
> -	if (cpu_is_offline(smp_processor_id())) {
> +	/* If nmi shootdown happened or this CPU is offline, just bail out. */
> +	if (cpus_shotdown() ||

I don't like "cpus_shotdown" - it doesn't hint at all that this is
special-handling crash/kdump.

And more importantly, I want it to be obvious that we do let the
crashing CPU into the MCE handler.


If we didn't, you would not handle *any* MCE, even a fatal one, while
dumping memory, so if the dump got corrupted by the MCE, you wouldn't
know. And I don't want to be the one staring at the corrupted dump and
wondering why I'm seeing what I'm seeing.

IOW, if we get a fatal MCE during dumping then we should go and die.
This is much better than silently corrupting the dump and not even
saying anything about it.
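
To make the intent concrete, here is a rough userspace sketch of the check
I have in mind, not a patch. The names mce_should_bail(), NO_CRASHING_CPU
and the crashing_cpu argument are made up for illustration; the real code
would have to get the crashing CPU from wherever the crash/kdump path
records it.

```c
#include <stdbool.h>

/* Hypothetical sentinel: no crash/kdump shootdown is in progress. */
#define NO_CRASHING_CPU (-1)

/*
 * Decide whether a CPU that took a machine check should leave the handler
 * early instead of joining the rendezvous.
 *
 * - An offline CPU bails out, as before.
 * - Once a crash is in progress, every CPU *except* the crashing one bails
 *   out, because those CPUs have been shot down.
 * - The crashing CPU itself is let through, so a fatal MCE while dumping
 *   is still handled and reported.
 */
bool mce_should_bail(int this_cpu, bool this_cpu_offline, int crashing_cpu)
{
	if (this_cpu_offline)
		return true;

	if (crashing_cpu != NO_CRASHING_CPU && crashing_cpu != this_cpu)
		return true;

	return false;
}
```

Written this way, the condition itself says that the crashing CPU is the
one exception, which a bare "cpus_shotdown()" does not.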


Good mailing practices for 400: avoid top-posting and trim the reply.
