[PATCH v2 2/8] iommu/arm-smmu-v3: Use S2FWB when available
Jason Gunthorpe
jgg at nvidia.com
Mon Sep 2 17:05:46 PDT 2024
On Mon, Sep 02, 2024 at 09:29:53AM +0000, Mostafa Saleh wrote:
> On Fri, Aug 30, 2024 at 01:40:19PM -0300, Jason Gunthorpe wrote:
> > On Fri, Aug 30, 2024 at 03:12:54PM +0000, Mostafa Saleh wrote:
> > > > + /*
> > > > + * If for some reason the HW does not support DMA coherency then using
> > > > + * S2FWB won't work. This will also disable nesting support.
> > > > + */
> > > > + if (FIELD_GET(IDR3_FWB, reg) &&
> > > > + (smmu->features & ARM_SMMU_FEAT_COHERENCY))
> > > > + smmu->features |= ARM_SMMU_FEAT_S2FWB;
> > > I think that’s for the SMMU coherency which in theory is not related to the
> > > master which FWB overrides, so this check is not correct.
> >
> > Yes, I agree, in theory.
> >
> > However the driver today already links them together:
> >
> > case IOMMU_CAP_CACHE_COHERENCY:
> > /* Assume that a coherent TCU implies coherent TBUs */
> > return master->smmu->features & ARM_SMMU_FEAT_COHERENCY;
> >
> > So this hunk was a continuation of that design.
> >
> > > What I meant in the previous thread that we should set FWB only for coherent
> > > masters as (in attach s2):
> > > if (smmu->features & ARM_SMMU_FEAT_S2FWB && dev_is_dma_coherent(master->dev)
> > > // set S2FWB in STE
> >
> > I think as I explained in that thread, it is not really correct
> > either. There is no reason to block using S2FWB for non-coherent
> > masters that are not used with VFIO. The page table will still place
> > the correct memattr according to the IOMMU_CACHE flag, S2FWB just
> > slightly changes the encoding.
>
> It’s not just the encoding that changes, as
> - Without FWB, stage-2 combine attributes
> - While with FWB, it overrides them.
You mean there is some incomming attribute in the transaction
(obviously not talking PCI here) and S2FWB combines with that?
> So a cacheable mapping in stage-2 can lead to a non-cacheable
> (or with different cachableitiy attributes) transaction based on the
> input. I am not sure though if there is such case in the kernel.
If the kernel supplies IOMMU_CACHE then the kernel also skips all the
cache flushing. So it would be a functional problem if combining was
causing a non-cachable access through a IOMMU_CACHE S2 already. The
DMA API would fail if that was the case.
> > If anything should be changed then it would be the above
> > IOMMU_CAP_CACHE_COHERENCY test, and I don't know if
> > dev_is_dma_coherent() would be correct there, or if it should do some
> > ACPI inspection or what.
>
> I agree, I believe that this assumption is not accurate, I am not sure
> what is the right approach here, but in concept I think we shouldn’t
> enable FWB for non-coherent devices (using dev_is_dma_coherent() or
> other check)
The DMA API requires that the cachability rules it sets via
IOMMU_CACHE are followed. In this way the stricter behavior of S2FWB
is a benefit, not a draw back.
I'm still not seeing a problm here??
Jason
More information about the linux-arm-kernel
mailing list