[PATCH v2 4/4] iommu: Get DT/ACPI parsing into the proper probe path

Robin Murphy robin.murphy at arm.com
Thu Apr 24 06:58:32 PDT 2025


On 15/04/2025 4:08 pm, Johan Hovold wrote:
> On Mon, Apr 14, 2025 at 04:37:59PM +0100, Robin Murphy wrote:
>> On 2025-04-11 9:02 am, Johan Hovold wrote:
>>> On Fri, Feb 28, 2025 at 03:46:33PM +0000, Robin Murphy wrote:
> 
>>>> @@ -155,7 +155,12 @@ int of_iommu_configure(struct device *dev, struct device_node *master_np,
>>>>    		dev_iommu_free(dev);
>>>>    	mutex_unlock(&iommu_probe_device_lock);
>>>>    
>>>> -	if (!err && dev->bus)
>>>> +	/*
>>>> +	 * If we're not on the iommu_probe_device() path (as indicated by the
>>>> +	 * initial dev->iommu) then try to simulate it. This should no longer
>>>> +	 * happen unless of_dma_configure() is being misused outside bus code.
>>>> +	 */
>>>
>>> This assumption does not hold as there is nothing preventing iommu
>>> driver probe from racing with a client driver probe.
>>
>> Not sure I follow - *this* assumption is that if we arrived here with
>> dev->iommu already allocated then __iommu_probe_device() is already in
>> progress for this device, either in the current callchain or on another
>> thread, and so we can (and should) skip calling into it again. There's
>> no ambiguity about that.
> 
> I was referring to this "should no longer happen unless
> of_dma_configure() is being misused outside bus code" claim, which
> appears to be false given the splat below.

That's not an assumption so much as a statement of intent. And really 
it's the other way round anyway, as a reminder that this replay call is 
only still here (unlike in the ACPI path) because there *is* still 
plenty of sketchy usage of of_dma_configure() which I'm wary of breaking.
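
To make the condition in the hunk concrete, here is a minimal userspace
sketch of the replay decision. It is only a model: struct model_device and
should_replay_probe() are hypothetical names invented for illustration, not
kernel API, and only the boolean condition is taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two struct device fields the check reads. */
struct model_device {
	const void *bus;	/* non-NULL once the device sits on a bus */
	const void *iommu;	/* non-NULL once dev->iommu has been allocated */
};

/*
 * Mirrors "!err && dev->bus && !dev_iommu_present" from the quoted hunk.
 * dev_iommu_present stands for the initial dev->iommu state: if it was
 * already set, a probe is taken to be in progress and is not replayed.
 */
static bool should_replay_probe(int err, const struct model_device *dev,
				bool dev_iommu_present)
{
	return !err && dev->bus != NULL && !dev_iommu_present;
}
```

In this model the replay only fires for a device on a bus, with no parse
error, whose dev->iommu was not already present on entry.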

>>>> +	if (!err && dev->bus && !dev_iommu_present)
>>>>    		err = iommu_probe_device(dev);
>>>>    
>>>>    	if (err && err != -EPROBE_DEFER)
>>>
>>> I hit the (now moved) dev_WARN() on the ThinkPad T14s where the GPU SMMU
>>> is probed late due to a clock dependency and can end up probing in
>>> parallel with the GPU driver.
>>
>> And what *should* happen is that the GPU driver probe waits for the
>> IOMMU driver probe to finish. Do you have fw_devlink enabled?
> 
> Yes, but you shouldn't rely on devlinks for correctness.
> 
> That said it does seem like something is not working as you think it is
> here, and indeed the iommu supplier link is not created until SMMUv2
> probe_device() (see arm_smmu_probe_device()).
> 
> So client devices can start to be probed (bus dma_configure() is called)
> before their iommu is ready also with devlinks enabled (and I do see
> this happen on every boot).

I didn't mean the explicit power management links created by the SMMU 
driver itself, I meant the fwnode links created automatically by 
fw_devlink_link_device() at device_add() time, which infer a 
supplier-consumer relationship from the "iommus" DT property. With 
those in place, device_links_check_suppliers() would then defer the GPU 
driver probe until the SMMU driver probe has completed successfully and 
called device_links_driver_bound().

Except it turns out that "iommus" is one of the optional properties 
which are only linked that way under "fw_devlink=strict", so that 
explains that, fair enough.
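
As a rough illustration of that behaviour, the following userspace sketch
models the supplier check. Everything here is an assumption made for the
example: the model_* types and model_check_suppliers() are hypothetical,
and treating an optional link as simply skipped when not in strict mode is
a simplification of what fw_devlink really does. Only the EPROBE_DEFER
value matches the kernel's definition.

```c
#include <stdbool.h>
#include <stddef.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal errno, absent from userspace */
#endif

/* Hypothetical supplier: only tracks whether its driver has bound. */
struct model_supplier {
	bool driver_bound;
};

/* Hypothetical link inferred from a DT property such as "iommus". */
struct model_link {
	const struct model_supplier *supplier;
	bool optional;		/* only enforced in strict mode */
};

/*
 * Return -EPROBE_DEFER if any enforced supplier has not bound yet,
 * otherwise 0 so that the consumer's probe may proceed.
 */
static int model_check_suppliers(const struct model_link *links, size_t n,
				 bool strict)
{
	for (size_t i = 0; i < n; i++) {
		if (links[i].optional && !strict)
			continue;	/* link not enforced: no ordering */
		if (!links[i].supplier->driver_bound)
			return -EPROBE_DEFER;
	}
	return 0;
}
```

With strict set to false, an unbound optional supplier does not hold the
consumer back, which is the situation in the log below.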

>>> [    3.805282] arm-smmu 3da0000.iommu: probing hardware configuration...
> 
>>> [    3.829042] platform 3d6a000.gmu: Adding to iommu group 8
>>>
>>> [    3.992050] ------------[ cut here ]------------
>>> [    3.993045] adreno 3d00000.gpu: late IOMMU probe at driver bind, something fishy here!
>>> [    3.994058] WARNING: CPU: 9 PID: 343 at drivers/iommu/iommu.c:579 __iommu_probe_device+0x2b0/0x4ac
>>>
>>> [    4.003272] CPU: 9 UID: 0 PID: 343 Comm: kworker/u50:2 Not tainted 6.15.0-rc1 #109 PREEMPT
>>> [    4.003276] Hardware name: LENOVO 21N2ZC5PUS/21N2ZC5PUS, BIOS N42ET83W (2.13 ) 10/04/2024
>>>
>>> [    4.025943] Call trace:
>>> [    4.025945]  __iommu_probe_device+0x2b0/0x4ac (P)
>>> [    4.030453]  iommu_probe_device+0x38/0x7c
>>> [    4.030455]  of_iommu_configure+0x188/0x26c
>>> [    4.030457]  of_dma_configure_id+0xcc/0x300
>>> [    4.030460]  platform_dma_configure+0x74/0xac
>>> [    4.030462]  really_probe+0x74/0x38c
>>
>> Indeed this is exactly what is *not* supposed to be happening - does
>> this patch help at all?
>>
>> https://lore.kernel.org/linux-iommu/09d901ad11b3a410fbb6e27f7d04ad4609c3fe4a.1741706365.git.robin.murphy@arm.com/
> 
> I've only seen that splat once so far so I don't have a reliable
> reproducer.
> 
> But AFAICS that patch won't help here where we appear to have iommu
> probe racing with bus dma_configure() called from really_probe() for the
> client device.

Well, tightening up __iommu_probe_device() would stand to slightly 
reduce the window in general while bus_set_iommu() is running. However 
you're right that this is a different race from the ones implicated 
there. I have now managed to provoke it on my Juno board with 
"driver_async_probe=*" (which does also require that patch for other 
reasons), and I think I've got a reasonable fix which I shall finish 
writing up and send shortly. Thanks for helping me nail this one down!

Cheers,
Robin.
