[PATCH v2] arm64: errata: Workaround for SI L1 downstream coherency issue

Robin Murphy robin.murphy at arm.com
Thu Jan 8 08:41:47 PST 2026


On 2026-01-08 3:18 pm, Will Deacon wrote:
> On Wed, Jan 07, 2026 at 05:55:40PM +0000, Robin Murphy wrote:
>> On 2026-01-07 4:33 pm, Will Deacon wrote:
>>> On Thu, Jan 01, 2026 at 06:55:05PM +0000, Marc Zyngier wrote:
>>>> The other elephant in the room is virtualisation: how does a guest
>>>> performing CMOs deal with this? How does it discover that the
>>>> host is broken? I also don't see any attempt to make KVM handle the
>>>> erratum on behalf of the guest...
>>>
>>> A guest shouldn't have to worry about the problem, as it only affects
>>> clean to PoC for non-coherent DMA agents that reside downstream of the
>>> SLC in the interconnect. Since VFIO doesn't permit assigning
>>> non-coherent devices to a guest, guests shouldn't ever need to push
>>> writes that far (and FWB would cause bigger problems if that was
>>> something we wanted to support)
>>>
>>> +Mostafa to keep me honest on the VFIO front.
>>
>> I don't think we actually prevent non-coherent devices being assigned, we
>> just rely on the IOMMU supporting IOMMU_CAP_CACHE_COHERENCY. Thus if there's
>> an I/O-coherent SMMU then it could end up being permitted, however I would
>> hope that either the affected devices are not behind such an SMMU, or at
>> least that if the SMMU imposes cacheable attributes then that prevents
>> traffic from taking the back-door path to RAM.
> 
> I think IOMMU_CAP_CACHE_COHERENCY is supposed to indicate whether or not
> the endpoint devices are coherent (i.e. whether IOMMU_CACHE makes sense)
> but it's true that, for the SMMU, we tie this to the coherency of the
> SMMU itself so it is a bit sketchy. There's an interesting thread between
> Mostafa and Jason about it:
> 
> https://lore.kernel.org/all/ZtHhdj6RAKACBCUG@google.com/

The point is that if there's a coherent interconnect downstream of the 
SMMU - which we infer from the SMMU's own coherency - then we should be 
able to make the *output* of SMMU translation coherent, regardless of 
what the incoming attributes from the device are. In IORT terms, CPM 
is really what matters for IOMMU_CACHE, not DACS.
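
As a rough illustration of that distinction, here is a small userspace C
model, not kernel code. The bit positions follow the IORT "Memory Access
Flags" field (CPM in bit 0, DACS in bit 1); the macro names, the function
names and the decision logic itself are assumptions made up for this
sketch, and only encode the reading that a coherent path to memory is
sufficient for IOMMU_CACHE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Bit positions follow the IORT "Memory Access Flags" field.
 * The macro and function names are hypothetical, for this sketch only.
 */
#define MAF_CPM		(1u << 0)	/* Coherent Path to Memory */
#define MAF_DACS	(1u << 1)	/* Device attributes are Cacheable and Inner Shareable */

/*
 * Assumption: IOMMU_CACHE only needs a coherent path downstream of the
 * SMMU, since the SMMU can then impose cacheable output attributes
 * whatever the device itself emits.
 */
static bool can_honour_iommu_cache(uint32_t maf)
{
	return (maf & MAF_CPM) != 0;
}

/* DACS only tells us whether the device is coherent with no override. */
static bool coherent_without_override(uint32_t maf)
{
	return (maf & MAF_CPM) && (maf & MAF_DACS);
}

/* Walk the flag combinations and check the model's answers. */
static void check_model(void)
{
	/* Coherent path, non-cacheable device attributes: override needed. */
	assert(can_honour_iommu_cache(MAF_CPM));
	assert(!coherent_without_override(MAF_CPM));

	/* Coherent path and cacheable device attributes. */
	assert(can_honour_iommu_cache(MAF_CPM | MAF_DACS));
	assert(coherent_without_override(MAF_CPM | MAF_DACS));

	/* No coherent path: DACS alone does not help. */
	assert(!can_honour_iommu_cache(MAF_DACS));
	assert(!can_honour_iommu_cache(0));
}
```

Calling check_model() from a test harness exercises all four flag
combinations. The case of interest here is CPM set with DACS clear, where
the model says IOMMU_CACHE can still be honoured even though the device
is not coherent on its own.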

> But, that aside, FWB throws a pretty big spanner in the works if we want
> to assign non-coherent devices.

If you mean having FWB on the CPU *without* also having it on the SMMU, 
then yes, there are various ways that could be problematic even with 
nominally-coherent devices. S2FWB on the SMMU, however, is *almost* the 
magic bullet that makes things fine for VFIO in general, except for the 
annoying mis-step that it's not guaranteed to override PCIe No Snoop 
(hopefully that might get fixed in future, but we'll still have today's 
implementations that do have the not-particularly-useful behaviour.)

This may be straying a bit far off $SUBJECT though - do we know if the 
affected devices in this case are behind a coherent SMMU, and how things 
work for IWB-OWB-ISh output attributes if so?

Thanks,
Robin.


