[PATCH] ARM: kexec: offline non panic CPUs on Kdump panic
Stephen Warren
swarren at wwwdotorg.org
Fri Jul 26 13:08:07 EDT 2013
On 07/26/2013 04:49 AM, Will Deacon wrote:
> [Adding Stephen Warren since he has been working in this area]
>
> On Fri, Jul 26, 2013 at 06:41:27AM +0100, vijay.kilari at gmail.com wrote:
>> From: Vijaya Kumar K <Vijaya.Kumar at caviumnetworks.com>
>>
>> In the normal kexec kernel load case, all CPUs are offlined
>> before machine_kexec() is called from kernel_kexec().
>> But in the crash (panic) case, the non-panic CPUs are left spinning
>> in the machine_crash_nonpanic_core() SMP function and are not offlined.
>>
>> When a crash kernel has been loaded with kexec and a panic is
>> triggered, machine_kexec() checks the number of CPUs online.
>> If more than one CPU is online, machine_kexec() fails to boot the
>> crash kernel, with the error below:
>>
>> kexec: error: multiple CPUs still online
>>
>> In the machine_crash_nonpanic_core() SMP function, mark the CPU
>> offline before entering the cpu_relax() loop.
>> diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c
>> @@ -73,6 +73,7 @@ void machine_crash_nonpanic_core(void *unused)
>>  	crash_save_cpu(&regs, smp_processor_id());
>>  	flush_cache_all();
>>
>> +	set_cpu_online(smp_processor_id(), false);
>>  	atomic_dec(&waiting_for_crash_ipi);
>>  	while (1)
>>  		cpu_relax();
>
> Ok, I guess this will work since the new kernel is loaded somewhere higher
> in memory and the crashed kernel will stick around, so the non-crashing CPUs
> can sit around spinning.
Does a kernel that's used as the crash kernel guarantee:
* Never to re-use the memory that was used by the previous kernel, so
that the spin loop code/data won't be corrupted, ever, no matter how
long the crash recovery kernel runs.
* Not to use SMP, so there's never a need to re-activate the non-boot CPUs,
which might not work if they aren't truly disabled but rather just
running a spin loop?
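
For reference, here is a rough user-space sketch of the bookkeeping under discussion. It is only a model, not kernel code: every model_* name and MODEL_NR_CPUS are made up for illustration, the online mask is a plain bitmask, and the guard is written to match the error message quoted above rather than copied from machine_kexec().

```c
#include <stdbool.h>
#include <stdio.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count for this model */

/* Bit n set means CPU n is marked online (stand-in for the kernel's online mask). */
static unsigned int model_online_mask;

static void model_set_cpu_online(unsigned int cpu, bool online)
{
	if (online)
		model_online_mask |= 1u << cpu;
	else
		model_online_mask &= ~(1u << cpu);
}

static unsigned int model_num_online_cpus(void)
{
	unsigned int count = 0;

	for (unsigned int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (model_online_mask & (1u << cpu))
			count++;
	return count;
}

/*
 * Models what a non-panic CPU does when it takes the crash IPI.
 * With the patch it clears its own online bit first. The real CPU
 * would then spin in cpu_relax() forever; the model simply returns.
 */
static void model_crash_nonpanic_core(unsigned int cpu, bool patched)
{
	if (patched)
		model_set_cpu_online(cpu, false);
}

/* Models the guard: refuse to continue if more than one CPU is marked online. */
static bool model_machine_kexec_allowed(void)
{
	if (model_num_online_cpus() > 1) {
		fprintf(stderr, "kexec: error: multiple CPUs still online\n");
		return false;
	}
	return true;
}

/* Simulates a panic on panic_cpu; returns true if the crash kernel would be booted. */
static bool model_crash_kexec(unsigned int panic_cpu, bool patched)
{
	model_online_mask = (1u << MODEL_NR_CPUS) - 1;	/* all CPUs start online */

	for (unsigned int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (cpu != panic_cpu)
			model_crash_nonpanic_core(cpu, patched);

	return model_machine_kexec_allowed();
}
```

In this model, model_crash_kexec(0, false) leaves all four bits set and the guard refuses, while model_crash_kexec(0, true) leaves only the panicking CPU marked online and the guard passes. Note that clearing the bit only changes the bookkeeping: the other CPUs are still executing the spin loop out of the old kernel's memory, which is what the two questions above are about.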