[PATCH v4 14/27] iommu/arm-smmu-v3: Make changing domains be hitless for ATS
Shameerali Kolothum Thodi
shameerali.kolothum.thodi at huawei.com
Wed Jan 31 03:20:48 PST 2024
> -----Original Message-----
> From: Jason Gunthorpe <jgg at nvidia.com>
> Sent: Friday, January 26, 2024 6:15 PM
> To: iommu at lists.linux.dev; Joerg Roedel <joro at 8bytes.org>; linux-arm-
> kernel at lists.infradead.org; Robin Murphy <robin.murphy at arm.com>; Will
> Deacon <will at kernel.org>
> Cc: Eric Auger <eric.auger at redhat.com>; Jean-Philippe Brucker <jean-
> philippe at linaro.org>; Moritz Fischer <mdf at kernel.org>; Michael Shavit
> <mshavit at google.com>; Nicolin Chen <nicolinc at nvidia.com>;
> patches at lists.linux.dev; Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi at huawei.com>
> Subject: [PATCH v4 14/27] iommu/arm-smmu-v3: Make changing domains be
> hitless for ATS
>
> The core code allows the domain to be changed on the fly without a forced
> stop in BLOCKED/IDENTITY. In this flow the driver should just continually
> maintain the ATS with no change while the STE is updated.
>
> ATS relies on a linked list smmu_domain->devices to keep track of which
> masters have the domain programmed, but this list is also used by
> arm_smmu_share_asid(), unrelated to ats.
>
> Create two new functions to encapsulate this combined logic:
> arm_smmu_attach_prepare()
> <caller generates and sets the STE>
> arm_smmu_attach_commit()
>
> The two functions can sequence both enabling ATS and disabling across
> the STE store. Have every update of the STE use this sequence.
>
> Installing a S1/S2 domain always enables the ATS if the PCIe device
> supports it.
>
> The enable flow is now ordered differently to allow it to be hitless:
>
> 1) Add the master to the new smmu_domain->devices list
> 2) Program the STE
> 3) Enable ATS at PCIe
> 4) Remove the master from the old smmu_domain
>
> This flow ensures that invalidations to either domain will generate an ATC
> invalidation to the device while the STE is being switched. Thus we don't
> need to turn off the ATS anymore for correctness.
>
> The disable flow is the reverse:
> 1) Disable ATS at PCIe
> 2) Program the STE
> 3) Invalidate the ATC
> 4) Remove the master from the old smmu_domain
>
> Move the nr_ats_masters adjustments to be close to the list
> manipulations. It is a count of the number of ATS enabled masters
> currently in the list. This is stricly before and after the STE/CD are
> revised, and done under the list's spin_lock.
This part 2 fails to boot in my hardware setup and the bisect points to
this patch here.
Splat:
31.988204] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
...
[ 32.357876] Call trace:
[ 32.363018] arm_smmu_attach_commit+0xa4/0x210
[ 32.372361] arm_smmu_attach_dev+0x208/0x3b4
[ 32.381344] __iommu_device_set_domain+0x7c/0x11c
[ 32.391243] __iommu_group_set_domain_internal+0x90/0x184
[ 32.402601] iommu_setup_default_domain+0x32c/0x4e8
[ 32.412863] __iommu_probe_device+0x450/0x478
...
> Signed-off-by: Jason Gunthorpe <jgg at nvidia.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 199 ++++++++++++++------
> 1 file changed, 141 insertions(+), 58 deletions(-)
>
> +static void arm_smmu_attach_commit(struct arm_smmu_master *master,
> + struct attach_state *state)
> +{
> + lockdep_assert_held(&arm_smmu_asid_lock);
> +
> + if (state->want_ats && !master->ats_enabled) {
> + arm_smmu_enable_ats(master);
> + } else if (master->ats_enabled) {
> + /*
> + * The translation has changed, flush the ATC. At this point the
> + * SMMU is translating for the new domain and both the
> old&new
> + * domain will issue invalidations.
> + */
> + arm_smmu_atc_inv_master(master);
> + }
> + master->ats_enabled = state->want_ats;
> +
> + arm_smmu_remove_master_domain(master,
> + iommu_get_domain_for_dev(master-
> >dev));
I think the core has not yet set the group->domain since we are still in attach().
Thanks,
Shameer
More information about the linux-arm-kernel
mailing list