[PATCH v5 01/17] iommu/arm-smmu-v3: Make STE programming independent of the callers

Thu Feb 22 09:43:46 PST 2024

On Wed, Feb 21, 2024 at 10:08:18AM -0400, Jason Gunthorpe wrote:
> On Wed, Feb 21, 2024 at 01:49:23PM +0000, Will Deacon wrote:
> 
> > Very roughly, yes, although I'd go further and just return a bitmap of
> > used qwords instead of tracking these bits. Basically, we could have some
> > #defines saying which qwords are used by which configs, 
> 
> I don't think this will work well for CD's EPD0 case..
> 
> static void arm_smmu_get_cd_used(const __le64 *ent, __le64 *used_bits)
> {
> 	used_bits[0] = cpu_to_le64(CTXDESC_CD_0_V);
> 	if (!(ent[0] & cpu_to_le64(CTXDESC_CD_0_V)))
> 		return;
> 	memset(used_bits, 0xFF, sizeof(struct arm_smmu_cd));
> 
> 	/* EPD0 means T0SZ/TG0/IR0/OR0/SH0/TTB0 are IGNORED */
> 	if (ent[0] & cpu_to_le64(CTXDESC_CD_0_TCR_EPD0)) {
> 		used_bits[0] &= ~cpu_to_le64(
> 			CTXDESC_CD_0_TCR_T0SZ | CTXDESC_CD_0_TCR_TG0 |
> 			CTXDESC_CD_0_TCR_IRGN0 | CTXDESC_CD_0_TCR_ORGN0 |
> 			CTXDESC_CD_0_TCR_SH0);
> 		used_bits[1] &= ~cpu_to_le64(CTXDESC_CD_1_TTB0_MASK);
> 	}
> }

Please can you explain more about the issue here? I know what EPDx are,
but I'm not understanding why they're problematic. This presumably
involves a hitless transition to/from an aborting CD?

> > and then we can
> > simplify the algorithm while retaining the ability to reject updates
> > to qwords which we're not expecting.
> 
> It is not much simplification. arm_smmu_entry_qword_diff() gets a bit
> shorter (not that it is complex anyhow) and other stuff gets worse.
> 
> > > We'd have to really start doing really hacky things like remove the
> > > SHCFG as a used field entirely - but I think if you do that you break
> > > the entire logic of the design and also go backwards to having
> > > programming that only works if STEs are constructed in certain ways.
> > 
> > I would actually like to remove SHCFG as a used field. If the encoding
> > was less whacky (i.e. if 0b00 always meant "use incoming"), then it would
> > be easy, but it shouldn't be too hard to work around that.
> 
> But why?
> 
> You throw away the entire logic of the design, go back to subtly
> coupling the two parts, and *for what*? Exactly what are we trying to
> achieve in return? You haven't explained why we are still discussing
> this afer 7 months. It really isn't worthwhile.

I'm just trying to avoid introducing dynamic behaviours to the driver
which aren't actually used, and per-qword tracking feels like an easier
way to maintain the hitless updates for the cases you care about. It's
really not about throwing away the entire logic of the design -- as I
said, I think this is looking pretty good. I'm also absolutely open to
being convinced that per-field makes more sense and per-qword is terrible,
so I'd really like to understand the E0PD case more.

As an aside: is this per-field/per-qword discussion the only thing holding
up a v6? With the rest of the feedback addressed and a version of Michael's
selftest that exercises stage-2 translating domains, I'd like to think
we could get it queued up soon.

Cheers,

Will