[PATCH v4 4/7] arm64: Handle early CPU boot failures
Suzuki K. Poulose
Suzuki.Poulose at arm.com
Wed Feb 3 09:24:07 PST 2016
On 03/02/16 17:01, Mark Rutland wrote:
> On Mon, Jan 25, 2016 at 06:07:02PM +0000, Suzuki K Poulose wrote:
>> From: Suzuki K. Poulose <suzuki.poulose at arm.com>
>>
>> 3. CPU_PANIC_KERNEL - CPU detected some serious issues which
>> requires kernel to crash immediately. The secondary CPU cannot
>> call panic() until it has initialised the GIC. This flag can
>> be used to instruct the master to do so.
>
> When would we use this last case?
As of now, it is used when we have incompatible ASID bits.
>
> Perhaps a better option is to always throw any incompatible CPU into an
> (MMU-off) pen, and assume that it's stuck in the kernel, even if we
> could theoretically turn it off.
Right, that is another option. I am fine with either.
>> - b __no_granule_support
>> + wfi
>> + b 1b
>
> The addition of wfi seems fine, but should be mentioned in the commit
> message.
Sure.
>> struct secondary_data secondary_data;
>> +/* Number of CPUs which aren't online, but looping in kernel text. */
>> +u32 cpus_stuck_in_kernel;
>
> Why u32 rather than int?
No specific reasons, since it is going to be a quantity, which cannot be < 0,
kept it unsigned. It could be unsigned int.
>> +#ifdef CONFIG_HOTPLUG_CPU
>> +static int op_cpu_kill(unsigned int cpu);
>> +#else
>> +static inline int op_cpu_kill(unsigned int cpu)
>> +{
>> + return -ENOSYS;
>> +}
>> +#endif
>
> There is no !CONFIG_HOTPLUG_CPU configuration any more.
Thats what I thought but then there was [1]. If you disable CONFIG_PM_SLEEP, you can
still build with !CONFIG_HOTPLUG_CPU (or in other words allnoconfig)
[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-November/384589.html
>>
>> + /* Make sure the update to status is visible */
>> + smp_rmb();
>> secondary_data.stack = NULL;
>> + status = READ_ONCE(secondary_data.status);
>
> What is the rmb intended to order here?
It was for the complete(). But...
>> + update_cpu_boot_status(CPU_BOOT_SUCCESS);
>> + /* Make sure the status update is visible before we complete */
>> + smp_wmb();
>
> Surely complete() has appropriate barriers?
Yes, it does. We can remove it.
>> #ifdef CONFIG_HOTPLUG_CPU
>> + update_cpu_boot_status(CPU_KILL_ME);
>> /* Check if we can park ourselves */
>> if (cpu_ops[cpu] && cpu_ops[cpu]->cpu_die)
>> cpu_ops[cpu]->cpu_die(cpu);
>
> I think you need a barrier to ensure visibility of the store prior to
> calling cpu_die (i.e. you want to order an access against execution). A
> dsb is what you want -- smp_wmb() only expands to a dmb.
>
OK.
>> #endif
>> + update_cpu_boot_status(CPU_STUCK_IN_KERNEL);
>>
>> cpu_park_loop();
>
> Likewise here.
OK.
Thanks
Suzuki
More information about the linux-arm-kernel
mailing list