[PATCH 12/19] iommu/arm-smmu-v3: Put writing the context descriptor in the right order

Thu Oct 12 05:34:31 PDT 2023

On Thu, Oct 12, 2023 at 05:01:16PM +0800, Michael Shavit wrote:
> On Wed, Oct 11, 2023 at 8:33 AM Jason Gunthorpe <jgg at nvidia.com> wrote:
> > If we are replacing a CD table entry when the STE already points at the CD
> > entry then we just need to do the make/break sequence.
> 
> Do you mean when the STE already points at the CD table? 

Yes

> What's the make/break sequence?

When a CD table entry is replaced at this point in the series, the code
makes the entry non-valid and then immediately makes it valid again.
This is because the CD code cannot yet handle a Valid to Valid
transition (it can about 10 patches later).
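
As a rough illustration of that sequence, here is a userspace toy, not
the driver code. Every toy_* name is hypothetical, the struct is not the
real CD layout, and toy_sync() only stands in for the invalidate/sync
commands the real driver issues. The writer refuses Valid to Valid, so a
replace has to clear the entry first and then write the new one:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for one CD table entry; NOT the real SMMUv3 CD layout. */
struct toy_cd {
	bool valid;
	uint64_t ttb;
};

/* Counts how many times the toy "hardware" was told to resync. */
static unsigned int toy_sync_count;

/* Stand-in for the invalidation + sync the real driver would issue. */
static void toy_sync(void)
{
	toy_sync_count++;
}

/*
 * Toy writer that only handles invalid -> valid and valid -> invalid.
 * Passing NULL clears the slot.
 */
static int toy_write_cd(struct toy_cd *slot, const struct toy_cd *new_cd)
{
	if (!new_cd) {
		slot->valid = false;
		slot->ttb = 0;
		toy_sync();
		return 0;
	}
	if (slot->valid)
		return -1;	/* valid -> valid is not supported here */

	slot->ttb = new_cd->ttb;
	slot->valid = true;
	toy_sync();
	return 0;
}

/* Replace a live entry: clear it first, then write the new one. */
static int toy_replace_cd(struct toy_cd *slot, const struct toy_cd *new_cd)
{
	int ret = toy_write_cd(slot, NULL);

	if (ret)
		return ret;
	return toy_write_cd(slot, new_cd);
}
```

Calling toy_write_cd() directly on a valid slot fails, while
toy_replace_cd() succeeds at the cost of two syncs and a window where
the entry is invalid.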

> > +               } else {
> > +                       /*
> > +                        * arm_smmu_write_ctx_desc() relies on the entry being
> > +                        * invalid to work, clear any existing entry.
> > +                        */
> > +                       ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID,
> > +                                                     NULL);
> > +                       if (ret) {
> > +                               master->domain = NULL;
> > +                               goto out_list_del;
> > +                       }
> >                 }
> >
> >                 ret = arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, &smmu_domain->cd);
> > @@ -2563,15 +2566,23 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> >                 }
> >
> >                 arm_smmu_make_cdtable_ste(&target, master, &master->cd_table);
> > +               arm_smmu_install_ste_for_dev(master, &target);
> 
> Even if it's handled correctly under the hood by clever ste writing
> logic, isn't it weird that we don't explicitly check whether the CD
> table is already installed and skip arm_smmu_install_ste_for_dev in
> that case?

There is a design logic at work here.

At this layer in the code we think in terms of 'target state'. We know
what the correct STE must be, so we compute that full value and make
the HW use that value. The lower layer computes the steps required to
put the HW into the target state, which might be a NOP.

Trying to optimize the NOP here would mean this layer has to keep track
of what state the STE is currently in, rather than only tracking what
state it should be in. Avoiding that tracking is a main point of the new
programming logic.

This is a pretty common design pattern, "desired state" or "target
state".
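
A minimal userspace sketch of the split, under loud assumptions: the
toy_* names, the 4-word STE, and the "bit 0 is valid" encoding are all
made up for illustration. The real lower layer also has to order its
writes so the HW never sees a torn STE, which this toy skips. The upper
layer builds the full target without reading the HW state, and the
lower layer works out whether anything needs to change:

```c
#include <stdint.h>
#include <string.h>

#define TOY_STE_WORDS 4

/* Toy stream table entry; the layout is invented for this example. */
struct toy_ste {
	uint64_t data[TOY_STE_WORDS];
};

/*
 * Lower layer: move the "hardware" entry to the target value.
 * Returns the number of words it had to rewrite; 0 means NOP.
 */
static unsigned int toy_install_ste(struct toy_ste *hw,
				    const struct toy_ste *target)
{
	unsigned int changed = 0;
	unsigned int i;

	for (i = 0; i < TOY_STE_WORDS; i++) {
		if (hw->data[i] != target->data[i]) {
			hw->data[i] = target->data[i];
			changed++;
		}
	}
	return changed;
}

/* Upper layer: compute the complete target from the inputs alone. */
static void toy_make_cdtable_ste(struct toy_ste *target, uint64_t cdtab_addr)
{
	memset(target, 0, sizeof(*target));
	target->data[0] = cdtab_addr | 1;	/* toy encoding: bit 0 = valid */
}

/* Attach path: no "is it already installed?" check at this layer. */
static unsigned int toy_attach(struct toy_ste *hw, uint64_t cdtab_addr)
{
	struct toy_ste target;

	toy_make_cdtable_ste(&target, cdtab_addr);
	return toy_install_ste(hw, &target);
}
```

Calling toy_attach() twice with the same address rewrites one word the
first time and nothing the second time, without the caller ever asking
what the entry currently holds.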

Later on this becomes more complex, as the CD table may be installed in
the STE while the S1DSS or EATS is not correct for S1 operation. Coding
it this way eventually corrects those things trivially as well. That
is something like 30 patches later.

Regards,
Jason