[V2 PATCH 1/3] x86/panic: Fix re-entrance problem due to panic on NMI

Mon Jul 27 19:02:11 PDT 2015

Hi,

Thanks for the review.

(2015/07/27 23:34), Michal Hocko wrote:
> On Mon 27-07-15 10:58:50, Hidehiro Kawai wrote:
> [...]
>> diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
>> index d05bd2e..5b32d81 100644
>> --- a/arch/x86/kernel/nmi.c
>> +++ b/arch/x86/kernel/nmi.c
>> @@ -230,7 +230,8 @@ void unregister_nmi_handler(unsigned int type, const char *name)
>>  	}
>>  #endif
>>  
>> -	if (panic_on_unrecovered_nmi)
>> +	if (panic_on_unrecovered_nmi &&
>> +	    atomic_cmpxchg(&panicking_cpu, -1, raw_smp_processor_id()) == -1)
>>  		panic("NMI: Not continuing");
> 
> Spreading the check to all NMI callers is quite ugly. Wouldn't it be
> better to introduce nmi_panic() which wouldn't be __noreturn unlike the
> regular panic.

Sure.  I'll fix it.

> The check could be also relaxed a bit and nmi_panic would
> return only if the ongoing panic is the current cpu when we really have
> to return and allow the preempted panic to finish.

It's reasonable.  I'll do that in the next version.

> Something like
[...]
> +void nmi_panic(const char *fmt, ...)

Since we can't directly pass variable arguments to a subroutine,
we have to use a macro or do like this:

void nmi_panic(const char *msg)
{
...
	panic("%s", msg);
}

If there is no objection, I'm going to use a macro.

> +{
> +	/*
> +	 * We have to back off if the NMI has preempted an ongoing panic and
> +	 * allow it to finish
> +	 */
> +	if (atomic_read(&panic_cpu) == raw_smp_processor_id())
> +		return;
> +
> +	panic();
> +}
> +EXPORT_SYMBOL(nmi_panic);
>  
>  struct tnt {
>  	u8	bit;
> 

-- 
Hidehiro Kawai
Hitachi, Ltd. Research & Development Group