[PATCH] panic, kexec: Don't mutex_trylock() in __crash_kexec()

Valentin Schneider vschneid at redhat.com
Fri Jun 17 04:52:05 PDT 2022


Hi Tao,

On 17/06/22 18:42, Tao Zhou wrote:
> Hi Valentin,
>
> On Thu, Jun 16, 2022 at 01:37:09PM +0100, Valentin Schneider wrote:
>> @@ -964,24 +966,31 @@ late_initcall(kexec_core_sysctl_init);
>>   */
>>  void __noclone __crash_kexec(struct pt_regs *regs)
>>  {
>> -	/* Take the kexec_mutex here to prevent sys_kexec_load
>> -	 * running on one cpu from replacing the crash kernel
>> -	 * we are using after a panic on a different cpu.
>> +	/*
>> +	 * This should be taking kexec_mutex before doing anything with the
>> +	 * kexec_crash_image, but this code can be run in NMI context which
>> +	 * means we can't even trylock.
>>  	 *
>> -	 * If the crash kernel was not located in a fixed area
>> -	 * of memory the xchg(&kexec_crash_image) would be
>> -	 * sufficient.  But since I reuse the memory...
>> +	 * Pairs with smp_mb() in do_kexec_load() and sys_kexec_file_load()
>>  	 */
>> -	if (mutex_trylock(&kexec_mutex)) {
>> -		if (kexec_crash_image) {
>> -			struct pt_regs fixed_regs;
>> -
>> -			crash_setup_regs(&fixed_regs, regs);
>> -			crash_save_vmcoreinfo();
>> -			machine_crash_shutdown(&fixed_regs);
>> -			machine_kexec(kexec_crash_image);
>> -		}
>> -		mutex_unlock(&kexec_mutex);
>> +	WRITE_ONCE(panic_wants_kexec, true);
>> +	smp_mb();
>> +	/*
>> +	 * If we're panic'ing while someone else is messing with the crash
>> +	 * kernel, this isn't going to end well.
>> +	 */
>> +	if (READ_ONCE(kexec_loading)) {
>> +		WRITE_ONCE(panic_wants_kexec, false);
>> +		return;
>> +	}
>
> So this is from NMI. The mutex guarantee that kexec_file_load() or 
> do_kexec_load() just one of them beat on cpu. NMI can happen on more
> than one cpu. That means that here be cumulativity here IMHO.
>

If you look at __crash_kexec() in isolation yes, but if you look at panic()
and nmi_panic() only a single NMI can get in there. I think that is also
true for invocations via crash_kexec().


> kexec_file_load()/                 NMI0                     NMI1..
> do_kexec_load()
>
> set kexec_loading=true     
> smp_mb()                set panic_wants_kexec=ture
>                         smp_mb()
>                         see kexec_loading=ture and
>                           conditionally set
>                           panic_wants_kexec=false;
>                                                  set panic_wants_kexec=ture
>                                                  smp_mb()
> see panic_wants_kexec=ture
>   conditionally set
>   kexec_loading=false
>                                                  see kexec_loading=false
>                                                  do kexec nmi things.
>
> You see conditionlly set kexec_loading or panic_wants_kexec there no barrier
> there and if the cumulativity to have the effect there should be a acquire-release,
> if I am not wrong.
>
> __crash_kexec():
>
> WRITE_ONCE(panic_wants_kexec, true);
> smp_mb();
> /*
>  * If we're panic'ing while someone else is messing with the crash
>  * kernel, this isn't going to end well.
>  */
> if (READ_ONCE(kexec_loading)) {
> 	smp_store_release(panic_wants_kexec, false);
> 	return;
> }
>
> kexec_file_load()/do_kexec_load():
>
> WRITE_ONCE(kexec_loading, true);
> smp_mb();
> if (smp_load_acquire(panic_wants_kexec)) {
>     WRITE_ONCE(kexec_loading, false);
>     ...
> }
>
> For those input, I'm sure I lost and feel hot..
> I thought that change the patten to load-store and set initial
> value but failed.
>

I'm not sure if further ordering is required here, the base case being

  WRITE_ONCE(panic_wants_kexec, true);         WRITE_ONCE(kexec_loading);
  smp_mb();                                    smp_mb();
  v0 = READ_ONCE(kexec_loading);               v1 = READ_ONCE(panic_wants_kexec);

  (see SB+fencembonceonces litmus test)

Wich ensures (!v0 && !v1) is never true. If modified to:

  WRITE_ONCE(panic_wants_kexec, true);         WRITE_ONCE(kexec_loading);
  smp_mb();                                    smp_mb();
  v0 = READ_ONCE(kexec_loading);               v1 = READ_ONCE(panic_wants_kexec);  
  if (v0)                                      if (v1)
          WRITE_ONCE(panic_wants_kexec, false);        WRITE_ONCE(kexec_loading, false);

then "(!v0 && !v1) is never true" still holds, so the exclusivity is
maintained AFAICT.




More information about the kexec mailing list