[RFC PATCH 0/3] arm64: errata: Disable FWB on parts with non-ARM interconnects

James Morse james.morse at arm.com
Tue Feb 21 09:41:35 PST 2023


Hi Oliver,

On 16/02/2023 18:52, Oliver Upton wrote:
> On Thu, Feb 16, 2023 at 06:21:58PM +0000, James Morse wrote:
>> When stage1 translation is disabled, the SCTLR_EL1.I bit controls the
>> attributes used for instruction fetch; one of the options results in a
>> non-cacheable access. A whole host of CPUs missed the FWB override
>> in this case, meaning a KVM guest could fetch stale/junk data instead of
>> instructions.
>>
>> The workaround is to disable FWB, and do the required cache maintenance
>> instead.
>>
>> The good news is, this isn't a problem for systems using Arm's
>> interconnect IP. The bad news is: Linux can't know this. Arm knows of
>> at least one platform that is affected by this erratum.
>>
>>
>> This series adds support for the 'Errata Management Firmware Interface', [0]
>> and queries that to determine if the CPU is affected or not.
>>
>> Unfortunately, no-one has firmware that supports this new interface yet,
>> and the least surprising thing to do is to enable the workaround by default,
>> meaning FWB is disabled on all these cores, even for unaffected platforms.
>> Platforms that are not affected can either take a firmware update to support
>> the interface, or if the kernel they run will only run on hardware that is
>> unaffected, disable the workaround at build time.

> Wait, what? Is there a legitimate concern that affected systems are in
> the wild today, or is there enough time for affected platforms to go and
> implement the necessary firmware interface?

The one platform that Arm is aware of isn't shipping yet - I assume it will implement the
firmware interface.

But I don't think Arm always knows what it is people are building ... it certainly doesn't
reach me. This affects a whole host of CPUs, so I wouldn't be surprised if there is an
existing part out there that is affected.


> Requiring correctly
> implemented systems to explicitly opt-out seems like quite a lot more
> work (w/ low likelihood) than having the one known platform go about
> this the right way.

Sure, but it's safe by default.


> I'm rather troubled by the idea of enabling this by default on systems
> that use these cores unless there really is no opportunity to
> course-correct.

It's the choice between correctness and performance. Probability says unless the CPU is
Neoverse-V2 (which is that one platform), you're not affected. But how much does
correctness matter? I'd hate to have to debug "1 in 100 times the guest doesn't boot".
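
To make the proposed default concrete, here is a rough sketch of the decision as described
in the cover letter. It is not the code from the series: the enum, the function and its
parameters are made-up names for illustration, and the firmware states are not taken from
the Errata Management Firmware Interface spec.

```c
#include <stdbool.h>

/* Hypothetical summary of what a firmware query could report. */
enum fw_erratum_state {
	FW_NO_INTERFACE,	/* firmware does not implement the query */
	FW_AFFECTED,		/* firmware says the platform is affected */
	FW_NOT_AFFECTED,	/* firmware says the platform is not affected */
};

/*
 * Returns true if FWB should be disabled, i.e. the workaround applies.
 *
 * workaround_built_in stands in for a build-time option (assumption),
 * cpu_on_erratum_list for a match against the affected CPU list.
 */
static bool should_disable_fwb(bool workaround_built_in,
			       bool cpu_on_erratum_list,
			       enum fw_erratum_state fw)
{
	/* Kernels built only for unaffected hardware can opt out. */
	if (!workaround_built_in)
		return false;

	/* CPUs that never had the erratum keep FWB. */
	if (!cpu_on_erratum_list)
		return false;

	/*
	 * Safe by default: only an explicit "not affected" answer from
	 * firmware lifts the workaround. Missing firmware support is
	 * treated the same as "affected".
	 */
	return fw != FW_NOT_AFFECTED;
}
```

The point of the last line is that an affected CPU on firmware without the interface loses
FWB, which is the performance cost being discussed here.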


Thanks,

James


