[PATCH 2/2] arm64: Add workaround for Arm Cortex-A77 erratum 1508412

James Morse james.morse at arm.com
Wed Jul 1 08:00:59 EDT 2020


Hi guys,

On 30/06/2020 09:36, Will Deacon wrote:
> On Tue, Jun 30, 2020 at 09:15:15AM +0100, Marc Zyngier wrote:
>> On 2020-06-29 22:33, Rob Herring wrote:
>>> On Cortex-A77 r0p0 and r1p0, a sequence of a non-cacheable or device load
>>> and a store exclusive or PAR_EL1 read can cause a deadlock.
>>>
>>> The workaround requires a DMB SY before and after a PAR_EL1 register read
>>> and the disabling of KVM. KVM must be disabled to prevent the problematic
>>> sequence in guests' EL1. This workaround also depends on a firmware
>>> counterpart to enable the h/w to insert DMB SY after load and store
>>> exclusive instructions. See the errata document SDEN-1152370 v10 [1] for
>>> more information.
> 
> Jose -- having an SMC interface to see if the firmware is holding up its
> side of the bargain would be really helpful here. There's been one in
> development for _months_ now; any update?
> 
>> This seems a bit extreme. Given that this CPU is most likely
>> used in big.LITTLE systems, there is still a bunch of CPUs
>> on which we could reliably execute guests.

(I'm guessing you don't want KVM to second-guess the scheduler's placement on big.LITTLE
systems?)


>> It is also likely that people could run trusted guests.

Knowing whether the user trusts the guest not to tickle this is the piece of information
that would change what we do here.


>> I would suggest printing a big fat warning and tainting the
>> kernel with TAINT_CPU_OUT_OF_SPEC, together with the required
>> DSBs in the KVM code.
> 
> Honestly, I think a TAINT is pointless here and we shouldn't be in the
> business of trying to police what people do with their systems when there's
> absolutely nothing we can do to help them. After all, they can always
> disable KVM themselves if they want to. The only sensible action you can
> take on seeing the taint is to disable the workaround to get rid of it,
> which is also the worst thing you can do! As another example, imagine if
> we had the ability to detect whether or not firmware was setting the patch
> registers. If we knew that it wasn't applying the workaround, would we
> TAINT on entering userspace? I don't think so. We'd probably just print a
> message when trying to apply the workaround, indicating that it was
> incomplete and the system may deadlock.
> 
> Finally, we have another erratum that allows guests to deadlock the system
> (Cortex-A57 832075)

Aha! Precedent.

We don't print any warning about untrusted guests in that case.


> so ultimately it's up to the person deploying the system
> to decide whether or not they can tolerate the risk of deadlock. In many
> cases, it won't be an issue, but if it is and they require KVM, then the
> part is dead in the water and Linux can't help with that.

Sure. So the plan here is to add the barriers around KVM's PAR_EL1 accesses, and get KVM to
print a warning that this platform is only suitable for trusted guests? (And do that for
A57's 832075 too.)
As it's a deadlock, not the guest influencing/corrupting the host, I think this is fine. Not
printing a warning implies we hope anyone deploying KVM on affected silicon has read the
errata document...
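
For illustration only, a rough sketch of the ordering that "barriers around the PAR_EL1
accesses" implies. This is a user-space model, not kernel code: dmb_sy(),
raw_read_par_el1() and read_par_el1_workaround() are hypothetical names, and the two
stand-ins only record a trace where real code would issue "dmb sy" and "mrs x0, par_el1"
(the latter is not accessible from EL0).

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Trace of the operations issued, so the ordering can be inspected. */
static char trace[64];

static void trace_op(const char *op)
{
	if (trace[0])
		strncat(trace, ",", sizeof(trace) - strlen(trace) - 1);
	strncat(trace, op, sizeof(trace) - strlen(trace) - 1);
}

/* Stand-in for a "dmb sy" instruction (assumption: modelled, not executed). */
static void dmb_sy(void)
{
	trace_op("dmb");
}

/* Stand-in for reading PAR_EL1; always returns 0 in this model. */
static uint64_t raw_read_par_el1(void)
{
	trace_op("par");
	return 0;
}

/*
 * Hypothetical wrapper: on an affected CPU, bracket the PAR_EL1 read
 * with a DMB SY on each side; otherwise do the plain read.
 */
static uint64_t read_par_el1_workaround(int cpu_affected)
{
	uint64_t par;

	if (cpu_affected)
		dmb_sy();
	par = raw_read_par_el1();
	if (cpu_affected)
		dmb_sy();

	return par;
}
```

Funnelling every PAR_EL1 read through one wrapper like this would keep the barrier
placement in a single spot, so the KVM call sites would not each need to remember it.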


Thanks,

James


