[PATCH v4 11/24] iommu: Add iommu_report_device_broken() to quarantine a broken device
Jason Gunthorpe
jgg at nvidia.com
Tue May 19 05:07:37 PDT 2026
On Mon, May 18, 2026 at 08:38:54PM -0700, Nicolin Chen wrote:
> +void iommu_report_device_broken(struct device *dev)
> +{
> + struct group_device *gdev;
> +
> + /*
> + * We cannot hold group->mutex here. Rely on iommu_group_broken_worker()
> + * to validate dev_has_iommu(). The iommu_group memory is RCU-protected
> + * via kfree_rcu() in iommu_group_release(), and group->devices is an
> + * RCU-protected list, so the lookup runs entirely under rcu_read_lock.
> + *
> + * Note the device might have been concurrently removed from the group
> + * (list_del_rcu) before iommu_deinit_device() cleared the dev->iommu.
> + */
> + rcu_read_lock();
> + gdev = __dev_to_gdev_rcu(dev);
> + if (gdev) {
If this is why the RCU is being added, it seems like overkill.
Just add the worker to struct dev_iommu and push it there so it can
use a mutex. But I'm confused: why are we even adding this function?
The entire design of this series was supposed to have the IOMMU driver
itself adjust its "STE" to inhibit translated TLPs synchronously
within its fully locked invalidation loop.
What's the async worker for?
Jason