[PATCH v2 1/1] iommu/arm-smmu-v3: Fix error case of range command

Will Deacon will at kernel.org
Wed Aug 9 04:23:18 PDT 2023


On Wed, Aug 09, 2023 at 05:22:06PM +0800, zhurui wrote:
> On 2023/8/9 0:43, Robin Murphy wrote:
> > On 08/08/2023 5:24 pm, Will Deacon wrote:
> >> On Mon, Aug 07, 2023 at 08:20:45PM +0100, Robin Murphy wrote:
> >>> Yeah, I'd rather not downgrade to a non-range invalidate since that
> >>> complicates the reasoning for the errata affecting those. If the size of the
> >>> invalidation is equal to TG then it can only represent a single last-level
> >>> page, i.e. TTL=3, thus if it does warrant handling here then indeed
> >>> rearranging to base the condition on num_pages as well ought to suffice.
> >>> However, this is all still begging the question of where and why we're doing
> >>> a *non-leaf* invalidation that isn't aligned to the size of a table, because
> >>> that in itself doesn't make a whole heap of sense - my hunch is that that
> >>> wants figuring out and could probably be fixed at the source.
> >>
> >> Isn't that described above because we're using CMDQ_TLBI_RANGE_NUM_MAX
> >> to break up the range into separate commands?
> > 
> > Not really, because if we're doing a genuine non-leaf invalidation of a
> > table then it should be a block-aligned range that ought to fit in a
> > single command and should certainly never involve a single-granule
> > remainder. If we're doing non-leaf invalidations of things that
> > logically don't need to be non-leaf, making them leaf would be the even
> > better option.
> > 
> 
> I agree with Robin that if the caller is doing a genuine non-leaf invalidation
> of a table, it should not involve a single-granule tlbi. It seems that the
> caller only filter the block size, but not the address aligned or not maybe.

There's only one caller though, right? That's the
io_pgtable_tlb_flush_walk() call in io-pgtable-arm.c which shouldn't trigger
this problem.

Do you have a backtrace for the case when we generate the illegal command?

> >> Do you mind if I queue the patch as-is for now? I don't think the driver
> >> should be emitting illegal commands, and v2 of the patch does seem like
> >> the obvious thing to do.
> > 
> > TBH I'd rather you just drop my patch if it's proven problematic, and
> > I'll take another crack at it soon. The potential problems we introduce
> > by using non-range invalidates on errata-affected MMU-700 revisions are
> > worse than the almost-entirely-theoretical one I was trying to address.
> > 
> 
> If you all agree to roll back the problematic code, is the first patch be OK?
> Should I need to add some more descriptions to clarify this?

I can just go and revert Robin's original patch, but I'd like to see your
backtrace first so that we understand how this is occurring.

Thanks,

Will



More information about the linux-arm-kernel mailing list