[ARM IOMMU] IOMMU framework concurrency issue

Jason Gunthorpe jgg at ziepe.ca
Tue Oct 17 09:33:37 PDT 2023


On Tue, Oct 17, 2023 at 07:10:23PM +0800, Zhenhua Huang wrote:
> Dear experts,
> 
> Saw a few crashes in our projects because of concurrency between (1) and
> (2).
> 
> bus notifier or bus_iommu_probe:
> 	__iommu_probe_device
> 		acquire iommu_probe_device_lock only
> 		iommu_init_device()
> 			//touch dev->iommu (1)
> 			dev_iommu_get()
> 			->probe_device()
> 
> Client device probing path:
> of_dma_configure			
> 	of_iommu_configure
> 		//touch dev->iommu  (2)
> 		...
> 
> 
> We already have 01657bc14a39 ("iommu: Avoid races around device probe"), and
> the big comment in __iommu_probe_device() mentions adopting device_lock()
> later. I noticed your big effort to use it, and IMO it can address the above
> issue (it protects dev->iommu):

I think something else has gone wrong here: you should not be able to
get to any really_probe() before the iommu side has done its part.

Even with proper device locking, the poor device that is racing isn't
going to work properly, as the IOMMU won't be guaranteed to be
consistently configured.

It seems like you need to resolve boot-time ordering on your
platform? (I don't know exactly how ARM works here, though.)

E.g., make sure the iommu driver is fully registered before allowing any
concurrent probes. Once the iommu driver is registered, it will be able
to catch the bus notifiers and serialize things properly.
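
To illustrate the ordering only, here is a minimal userspace model, not
kernel code and not how any particular ARM platform does it. Every model_*
name is hypothetical. The one borrowed fact is that the kernel's
EPROBE_DEFER is 517; the assumption is that a client probe arriving before
the iommu driver is registered gets deferred and retried, rather than
touching dev->iommu concurrently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The kernel's EPROBE_DEFER (517) is not exported to userspace; mirrored here. */
#define MODEL_EPROBE_DEFER 517

/* Hypothetical stand-in for a client device; not a kernel structure. */
struct model_device {
	const char *name;
	bool iommu_configured;	/* models dev->iommu having been set up */
	bool bound;		/* models the client driver having probed */
};

/* Models "the iommu driver is fully registered". */
static bool model_iommu_registered;

void model_iommu_register(void)
{
	model_iommu_registered = true;
}

/*
 * Models the client probe path: it refuses to touch the device's iommu
 * state until the iommu side is registered, and asks to be retried.
 */
int model_client_probe(struct model_device *dev)
{
	if (!model_iommu_registered)
		return -MODEL_EPROBE_DEFER;

	dev->iommu_configured = true;
	dev->bound = true;
	return 0;
}

/* Models a deferred-probe retry pass; returns how many devices got bound. */
int model_retry_deferred(struct model_device **devs, size_t n)
{
	int newly_bound = 0;

	assert(devs != NULL);
	for (size_t i = 0; i < n; i++) {
		if (!devs[i]->bound && model_client_probe(devs[i]) == 0)
			newly_bound++;
	}
	return newly_bound;
}
```

In this model a client probed too early is left untouched and only binds
on the retry pass that runs after model_iommu_register(), so there is no
window in which both sides configure the device at once.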

Jason


