[PATCH v3 0/2] arm64: Allow erratum 1418040 for late CPUs
Sai Prakash Ranjan
saiprakash.ranjan at codeaurora.org
Fri Sep 11 12:34:26 EDT 2020
On 2020-09-11 19:00, Marc Zyngier wrote:
> On 2020-09-10 14:43, Sai Prakash Ranjan wrote:
>> On 2020-09-09 20:23, Doug Anderson wrote:
>>> Hi,
>>>
>>> On Fri, Aug 21, 2020 at 11:15 AM Catalin Marinas
>>> <catalin.marinas at arm.com> wrote:
>>>>
>>>> On Fri, 31 Jul 2020 18:38:22 +0100, Marc Zyngier wrote:
>>>> > Erratum 1418040 currently prevents a late CPU from booting if none
>>>> > of the early CPUs are affected by it. This is because the handling
>>>> > is implemented as alternatives, and we have already got rid of them
>>>> > by the time userspace onlines a new CPU.
>>>> >
>>>> > A solution to this is to move everything into C code, and rely on
>>>> > static keys instead. Once this is done, the feature can be allowed
>>>> > for late CPUs.
>>>> >
>>>> > [...]
>>>>
>>>> Applied to arm64 (for-next/fixes), thanks!
>>>>
>>>> [1/2] arm64: Move handling of erratum 1418040 into C code
>>>> https://git.kernel.org/arm64/c/d49f7d7376d0
>>>> [2/2] arm64: Allow booting of late CPUs affected by erratum 1418040
>>>> https://git.kernel.org/arm64/c/bf87bb0881d0
>>>
>>> NOTE: patch 2 seems to have come in through a stable merge onto Chrome
>>> OS 5.4 and is causing a regression when resuming from suspend. In the
>>> short term we've got a revert going into our tree:
>>>
>>> https://crrev.com/c/2399101
>>>
>>> ...but that's obviously not a long term fix. I haven't done any
>>> debugging of this myself, though I can if there's nobody more
>>> qualified to do it and/or nobody else has time. I'm just trying to
>>> make sure that the problem is reported somewhere where others might
>>> notice it rather than in an obscure Chrome OS tree. ;-)
>>>
>>
>> The root cause is pretty straightforward; however, I'm afraid the
>> solution isn't, though I may be mistaken. This happens on
>> big.LITTLE systems where the CPUs differ in erratum 1418040:
>> it applies only to the big cores and not to the little cores.
>> So when trying to bring up the little cores during resume, there
>> is a conflict as below (messages snipped from the internal bug
>> for more visibility).
>>
>> Enabling non-boot CPUs ...
>> CPU features: CPU1: Detected conflict for capability 35 (ARM erratum 1418040), System: 1, CPU: 0
>> CPU1: will not boot
>> CPU1: will not boot
>> CPU1: failed to come online
>> psci: CPU1 killed (polled 0 ms)
>> CPU1: died during early boot
>> Error taking CPU1 up: -5
>
> This is becoming very annoying... By allowing the buggy CPUs to come
> in late, we have made it impossible for the good ones to work
> correctly.
>
> Can you try this (untested yet, I'm dealing with another bucket of
> errata at the moment):
>
> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> index 6c8303559beb..fcf7f763400c 100644
> --- a/arch/arm64/kernel/cpu_errata.c
> +++ b/arch/arm64/kernel/cpu_errata.c
> @@ -477,6 +477,7 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
> .capability = ARM64_WORKAROUND_1418040,
> ERRATA_MIDR_RANGE_LIST(erratum_1418040_list),
> .type = (ARM64_CPUCAP_SCOPE_LOCAL_CPU |
> + ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU |
> ARM64_CPUCAP_PERMITTED_FOR_LATE_CPU),
> },
> #endif
>
Yes, this works.
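
For anyone following along, here is a rough standalone model of why the
extra flag helps, as far as I understand the late-CPU check. It is only a
sketch: late_cpu_allowed() and the CAP_* bit values are made up for
illustration and are not the kernel's real helper or definitions. Only
the two flag names in the diff above come from the actual code.

```c
#include <stdbool.h>

/* Illustrative bit values only; NOT the kernel's real definitions. */
#define CAP_PERMITTED_FOR_LATE_CPU (1u << 0) /* late CPU may have the cap when the system does not */
#define CAP_OPTIONAL_FOR_LATE_CPU  (1u << 1) /* late CPU may lack the cap when the system has it */

/*
 * Hypothetical helper modelling the late-CPU conflict check.
 * Returns true if the late CPU would be allowed to come online.
 */
static bool late_cpu_allowed(unsigned int type, bool system_has_cap,
			     bool cpu_has_cap)
{
	/* System uses the workaround, this CPU does not need it. */
	if (system_has_cap && !cpu_has_cap)
		return (type & CAP_OPTIONAL_FOR_LATE_CPU) != 0;

	/* System does not use the workaround, this CPU needs it. */
	if (!system_has_cap && cpu_has_cap)
		return (type & CAP_PERMITTED_FOR_LATE_CPU) != 0;

	/* No mismatch between the system state and this CPU. */
	return true;
}
```

With only the PERMITTED flag set, the case from the log (System: 1,
CPU: 0) is rejected in this model; with both flags set, as in the diff
above, the little core is let through.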
Thanks,
Sai
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation