[ARM IOMMU] IOMMU framework concurrency issue

Zhenhua Huang quic_zhenhuah at quicinc.com
Tue Oct 17 04:10:23 PDT 2023


Dear experts,

We have seen a few crashes in our projects caused by a race between the
two accesses to dev->iommu marked (1) and (2) below.

bus notifier or bus_iommu_probe:
	__iommu_probe_device
		acquire iommu_probe_device_lock only
		iommu_init_device()
			//touch dev->iommu (1)
			dev_iommu_get()
			->probe_device()

Client device probing path:
of_dma_configure			
	of_iommu_configure
		//touch dev->iommu  (2)
		...


We already have 01657bc14a39 ("iommu: Avoid races around device probe"),
and the big comment in __iommu_probe_device() refers to adopting
device_lock() later on. I noticed your considerable effort to make use of
it, and IMO it can address the issue above, since it protects dev->iommu:
https://lore.kernel.org/all/0-v2-d2762acaf50a+16d-iommu_group_locking2_jgg@nvidia.com/T/#md11b80c9e5c90ab97904f1bddecc21d97c296c0d
But that discussion pointed out a case "which already violates
*other* IOMMU API assumptions": some drivers call of_dma_configure()
directly without acquiring device_lock().
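
To make the concern concrete, here is a minimal userspace sketch of the two
paths. It is only a model, not kernel code: every model_* name is
hypothetical, a pthread mutex stands in for device_lock(), and the
"probe failed, free dev->iommu" step in path (1) is my assumption about
one interleaving that would match the freed dev->iommu seen in the dump.
With both paths holding the same lock, path (2) never touches a freed
pointer:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MODEL_ITERATIONS 10000

/* Hypothetical stand-ins for struct dev_iommu and struct device. */
struct model_dev_iommu { int fwspec_set; };

struct model_device {
	pthread_mutex_t lock;          /* models device_lock(dev) */
	struct model_dev_iommu *iommu; /* models dev->iommu */
	long touches;                  /* safe accesses made by path (2) */
};

/* Get-or-allocate, in the spirit of dev_iommu_get(); caller holds the lock. */
static struct model_dev_iommu *model_dev_iommu_get(struct model_device *dev)
{
	if (!dev->iommu)
		dev->iommu = calloc(1, sizeof(*dev->iommu));
	return dev->iommu;
}

/* Path (1): bus notifier side. Assumed here: probe fails, so param is freed. */
static void *model_bus_probe(void *arg)
{
	struct model_device *dev = arg;

	for (int i = 0; i < MODEL_ITERATIONS; i++) {
		pthread_mutex_lock(&dev->lock);
		model_dev_iommu_get(dev);
		free(dev->iommu);      /* error path drops dev->iommu */
		dev->iommu = NULL;
		pthread_mutex_unlock(&dev->lock);
	}
	return NULL;
}

/* Path (2): client probing side, models of_iommu_configure() touching dev->iommu. */
static void *model_client_probe(void *arg)
{
	struct model_device *dev = arg;

	for (int i = 0; i < MODEL_ITERATIONS; i++) {
		pthread_mutex_lock(&dev->lock);
		struct model_dev_iommu *param = model_dev_iommu_get(dev);
		if (param) {
			param->fwspec_set = 1; /* cannot be freed while we hold the lock */
			dev->touches++;
		}
		pthread_mutex_unlock(&dev->lock);
	}
	return NULL;
}

/* Runs both paths concurrently; returns the number of safe accesses in path (2). */
long run_model(void)
{
	struct model_device dev = { .iommu = NULL, .touches = 0 };
	pthread_t t1, t2;

	pthread_mutex_init(&dev.lock, NULL);
	if (pthread_create(&t1, NULL, model_bus_probe, &dev))
		return -1;
	if (pthread_create(&t2, NULL, model_client_probe, &dev)) {
		pthread_join(t1, NULL);
		return -1;
	}
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	free(dev.iommu);
	pthread_mutex_destroy(&dev.lock);
	return dev.touches;
}
```

Built with -pthread, run_model() should return MODEL_ITERATIONS. Dropping
the lock/unlock pair from model_client_probe() models a caller that
reaches of_dma_configure() without device_lock(): the pointer read in
path (2) can then race with the free in path (1).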

Could you please share your views on how to address this further?

Here is one example crash log; inspecting the dump showed that dev->iommu
had already been freed:
[ 0.898000][ T77] Unable to handle kernel paging request at virtual address 70c7e81d817765bc
…
[ 0.898219][ T77] pc : iommu_fwspec_init+0x30/0xc4
[ 0.898223][ T77] lr : of_iommu_xlate+0x58/0xe0
…
[ 0.898252][ T77] Call trace:
[ 0.898253][ T77] iommu_fwspec_init+0x30/0xc4
[ 0.898255][ T77] of_iommu_xlate+0x58/0xe0
[ 0.898257][ T77] of_iommu_configure+0x16c/0x224
[ 0.898260][ T77] of_dma_configure_id+0x1cc/0x244
[ 0.898265][ T77] platform_dma_configure+0x34/0x80
[ 0.898267][ T77] really_probe+0x110/0x384
[ 0.898271][ T77] __driver_probe_device+0xb4/0xe4
[ 0.898274][ T77] driver_probe_device+0x44/0x210
[ 0.898277][ T77] __device_attach_driver+0x144/0x170
[ 0.898280][ T77] bus_for_each_drv+0x9c/0xec
[ 0.898282][ T77] __device_attach_async_helper+0x78/0xd0
[ 0.898285][ T77] async_run_entry_fn+0x44/0x118
[ 0.898287][ T77] process_one_work+0x1e4/0x43c
[ 0.898290][ T77] worker_thread+0x25c/0x430
[ 0.898293][ T77] kthread+0x104/0x1d4
[ 0.898295][ T77] ret_from_fork+0x10/0x20

Thanks,
Zhenhua


