[PATCH v7 5/9] iommu/arm-smmu-v3: Make arm_smmu_alloc_cd_ptr()

Jason Gunthorpe jgg at nvidia.com
Mon Apr 22 07:20:53 PDT 2024


On Fri, Apr 19, 2024 at 09:14:21PM +0000, Mostafa Saleh wrote:
> Hi Jason,
> 
> On Tue, Apr 16, 2024 at 04:28:16PM -0300, Jason Gunthorpe wrote:
> > Only the attach callers can perform an allocation for the CD table entry,
> > the other callers must not do so, they do not have the correct locking and
> > they cannot sleep. Split up the functions so this is clear.
> > 
> > arm_smmu_get_cd_ptr() will return pointer to a CD table entry without
> > doing any kind of allocation.
> > 
> > arm_smmu_alloc_cd_ptr() will allocate the table and any required
> > leaf.
> > 
> > A following patch will add lockdep assertions to arm_smmu_alloc_cd_ptr()
> > once the restructuring is completed and arm_smmu_alloc_cd_ptr() is never
> > called in the wrong context.
> > 
> > Signed-off-by: Jason Gunthorpe <jgg at nvidia.com>
> > ---
> >  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 61 +++++++++++++--------
> >  1 file changed, 39 insertions(+), 22 deletions(-)
> > 
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > index f3df1ec8d258dc..a0d1237272936f 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > @@ -98,6 +98,7 @@ static struct arm_smmu_option_prop arm_smmu_options[] = {
> >  
> >  static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
> >  				    struct arm_smmu_device *smmu);
> > +static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master);
> >  
> >  static void parse_driver_options(struct arm_smmu_device *smmu)
> >  {
> > @@ -1207,29 +1208,51 @@ static void arm_smmu_write_cd_l1_desc(__le64 *dst,
> >  struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
> >  					u32 ssid)
> >  {
> > -	__le64 *l1ptr;
> > -	unsigned int idx;
> >  	struct arm_smmu_l1_ctx_desc *l1_desc;
> > -	struct arm_smmu_device *smmu = master->smmu;
> >  	struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
> >  
> > +	if (!cd_table->cdtab)
> > +		return NULL;
> > +
> >  	if (cd_table->s1fmt == STRTAB_STE_0_S1FMT_LINEAR)
> >  		return (struct arm_smmu_cd *)(cd_table->cdtab +
> >  					      ssid * CTXDESC_CD_DWORDS);
> >  
> > -	idx = ssid >> CTXDESC_SPLIT;
> > -	l1_desc = &cd_table->l1_desc[idx];
> > -	if (!l1_desc->l2ptr) {
> > -		if (arm_smmu_alloc_cd_leaf_table(smmu, l1_desc))
> > -			return NULL;
> > +	l1_desc = &cd_table->l1_desc[ssid / CTXDESC_L2_ENTRIES];
> 
> These operations used to be shift and bit masking which made sense as it does
> what hardware does, is there any reason you changed it to division and modulo?
> I checked the disassembly and gcc does the right thing as constants are power
> of 2, but I am just curious.

I generally prefer the clarity and succinctness of / and % instead of
hacking up bit operations that the compiler will generate
automatically anyhow.

If bit extraction is needed, it is better to wrap it in
FIELD_GET() than to open-code it.
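
To make the equivalence discussed above concrete, here is a minimal
userspace sketch (not kernel code). The two #defines use the values from
the quoted diff; the four helper names are hypothetical and exist only
for illustration. For unsigned values and a power-of-two divisor, both
forms compute the same indexes.

```c
#include <assert.h>
#include <stdint.h>

/* Values as in the quoted diff; the helpers below are hypothetical. */
#define CTXDESC_SPLIT		10
#define CTXDESC_L2_ENTRIES	(1 << CTXDESC_SPLIT)

/* Shift-and-mask form, mirroring what the hardware does. */
static inline uint32_t l1_idx_shift(uint32_t ssid)
{
	return ssid >> CTXDESC_SPLIT;
}

static inline uint32_t l2_idx_mask(uint32_t ssid)
{
	return ssid & (CTXDESC_L2_ENTRIES - 1);
}

/* Division-and-modulo form, as preferred in this reply. */
static inline uint32_t l1_idx_div(uint32_t ssid)
{
	return ssid / CTXDESC_L2_ENTRIES;
}

static inline uint32_t l2_idx_mod(uint32_t ssid)
{
	return ssid % CTXDESC_L2_ENTRIES;
}
```

The equivalence only holds because ssid is unsigned and the divisor is a
power of two; with a signed ssid the compiler would have to emit extra
fix-up code for negative values.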

> > +static struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master,
> > +						 u32 ssid)
> > +{
> > +	struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
> > +	struct arm_smmu_device *smmu = master->smmu;
> > +
> > +	if (!cd_table->cdtab) {
> > +		if (arm_smmu_alloc_cd_tables(master))
> > +			return NULL;
> >  	}
> > -	idx = ssid & (CTXDESC_L2_ENTRIES - 1);
> > -	return &l1_desc->l2ptr[idx];
> > +
> > +	if (cd_table->s1fmt == STRTAB_STE_0_S1FMT_64K_L2) {
> > +		unsigned int idx = ssid >> CTXDESC_SPLIT;
> 
> Ok, now it’s a shift, I think we should be consistent with how we
> calculate the index.

Sure. Changing that to / will leave CTXDESC_SPLIT unused except in
computing CTXDESC_L2_ENTRIES, so that can be simplified too:

-#define CTXDESC_SPLIT                  10
-#define CTXDESC_L2_ENTRIES             (1 << CTXDESC_SPLIT)
+#define CTXDESC_L2_ENTRIES             1024
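
With the constant written as a plain 1024, nothing in the source states
that it must stay a power of two. One possible guard, shown here as a
userspace sketch using C11 static_assert and not something proposed in
this series, is a compile-time check. The helper name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define CTXDESC_L2_ENTRIES	1024

/*
 * Assumption for this sketch: the divisor must remain a power of two so
 * that / and % still match the hardware's shift-and-mask index split.
 */
static_assert((CTXDESC_L2_ENTRIES & (CTXDESC_L2_ENTRIES - 1)) == 0,
	      "CTXDESC_L2_ENTRIES must be a power of two");

/* Hypothetical helper: index of the L2 entry within its leaf table. */
static inline uint32_t cd_l2_index(uint32_t ssid)
{
	return ssid % CTXDESC_L2_ENTRIES;
}
```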


> > @@ -1357,7 +1380,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid,
> >  	if (WARN_ON(ssid >= (1 << cd_table->s1cdmax)))
> >  		return -E2BIG;
> >  
> > -	cd_table_entry = arm_smmu_get_cd_ptr(master, ssid);
> > +	cd_table_entry = arm_smmu_alloc_cd_ptr(master, ssid);
> 
> The only path that allocates the main table is “arm_smmu_attach_dev”,

There are two places that allocate the leaf, arm_smmu_attach_dev()
(for the RID) and arm_smmu_sva_set_dev_pasid() (for a PASID)

At this moment all the paths rely on the above to allocate the
leaf. The next patch makes arm_smmu_attach_dev() allocate the leaf
itself. A few patches later the PASID path also allocates the leaf
itself, at which point the above is removed.

> I guess it would be more robust to leave that as is and have 2
> versions of get_cd, one that allocates leaf and one that is not
> allocating, what do you think?

I'm not sure what you are asking? We have two versions. One is called
alloc and one is called get. They have different locking requirements
on the caller, so they have different names. I would not call them both
get?
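
As a rough illustration of the get/alloc split, here is a userspace toy
model. All toy_* names, the L1 size and the use of calloc() are
assumptions made for this sketch; it is not the driver code and it
models neither the linear format nor any locking.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define L2_ENTRIES	1024	/* mirrors CTXDESC_L2_ENTRIES */
#define L1_ENTRIES	4	/* arbitrary size for the sketch */

struct toy_cd {
	uint64_t data[8];
};

struct toy_cd_table {
	struct toy_cd *l2[L1_ENTRIES];	/* NULL until a leaf exists */
};

/* Lookup only: never allocates, so usable where sleeping is not allowed. */
static struct toy_cd *toy_get_cd_ptr(struct toy_cd_table *t, uint32_t ssid)
{
	uint32_t idx = ssid / L2_ENTRIES;

	if (idx >= L1_ENTRIES || !t->l2[idx])
		return NULL;
	return &t->l2[idx][ssid % L2_ENTRIES];
}

/* May allocate the leaf: meant only for attach-style callers. */
static struct toy_cd *toy_alloc_cd_ptr(struct toy_cd_table *t, uint32_t ssid)
{
	uint32_t idx = ssid / L2_ENTRIES;

	if (idx >= L1_ENTRIES)
		return NULL;
	if (!t->l2[idx]) {
		t->l2[idx] = calloc(L2_ENTRIES, sizeof(struct toy_cd));
		if (!t->l2[idx])
			return NULL;
	}
	return toy_get_cd_ptr(t, ssid);
}
```

In this model a caller that cannot sleep only ever uses
toy_get_cd_ptr() and must handle a NULL return, while the attach-style
caller uses toy_alloc_cd_ptr() once so that later lookups succeed.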

Thanks,
Jason


