[PATCH v2 4/4] iommu: Get DT/ACPI parsing into the proper probe path
Robin Murphy
robin.murphy at arm.com
Thu Apr 24 06:58:32 PDT 2025
On 15/04/2025 4:08 pm, Johan Hovold wrote:
> On Mon, Apr 14, 2025 at 04:37:59PM +0100, Robin Murphy wrote:
>> On 2025-04-11 9:02 am, Johan Hovold wrote:
>>> On Fri, Feb 28, 2025 at 03:46:33PM +0000, Robin Murphy wrote:
>
>>>> @@ -155,7 +155,12 @@ int of_iommu_configure(struct device *dev, struct device_node *master_np,
>>>> dev_iommu_free(dev);
>>>> mutex_unlock(&iommu_probe_device_lock);
>>>>
>>>> - if (!err && dev->bus)
>>>> + /*
>>>> + * If we're not on the iommu_probe_device() path (as indicated by the
>>>> + * initial dev->iommu) then try to simulate it. This should no longer
>>>> + * happen unless of_dma_configure() is being misused outside bus code.
>>>> + */
>>>
>>> This assumption does not hold as there is nothing preventing iommu
>>> driver probe from racing with a client driver probe.
>>
>> Not sure I follow - *this* assumption is that if we arrived here with
>> dev->iommu already allocated then __iommu_probe_device() is already in
>> progress for this device, either in the current callchain or on another
>> thread, and so we can (and should) skip calling into it again. There's
>> no ambiguity about that.
>
> I was referring to the this "should no longer happen unless
> of_dma_configure() is being misused outside bus code" claim, which
> appears to be false given the splat below.
That's not an assumption so much as a statement of intent. And really
it's the other way round anyway, as a reminder that this replay call is
only still here (unlike in the ACPI path) because there *is* still
plenty of sketchy usage of of_dma_configure() which I'm wary of breaking.
>>>> + if (!err && dev->bus && !dev_iommu_present)
>>>> err = iommu_probe_device(dev);
>>>>
>>>> if (err && err != -EPROBE_DEFER)
>>>
>>> I hit the (now moved) dev_WARN() on the ThinkPad T14s where the GPU SMMU
>>> is probed late due to a clock dependency and can end up probing in
>>> parallel with the GPU driver.
>>
>> And what *should* happen is that the GPU driver probe waits for the
>> IOMMU driver probe to finish. Do you have fw_devlink enabled?
>
> Yes, but you shouldn't rely on devlinks for correctness.
>
> That said it does seem like something is not working as you think it is
> here, and indeed the iommu supplier link is not created until SMMUv2
> probe_device() (see arm_smmu_probe_device()).
>
> So client devices can start to be probed (bus dma_configure() is called)
> before their iommu is ready also with devlinks enabled (and I do see
> this happen on every boot).
I didn't mean the explicit power management links created by the SMMU
driver itself, I meant the fwnode links created automatically by
fw_devlink_link_device() at device_add() time, which infer a
supplier-consumer relationship from the "iommus" DT property, wherein
device_links_check_suppliers() would then defer the GPU driver probe
until the SMMU driver probe has completed successfully probing and
called device_links_driver_bound().
Except it turns out that "iommus" is one of the optional properties
which are only linked that way under "fw_devlink=strict", so that
explains that, fair enough.
>>> [ 3.805282] arm-smmu 3da0000.iommu: probing hardware configuration...
>
>>> [ 3.829042] platform 3d6a000.gmu: Adding to iommu group 8
>>>
>>> [ 3.992050] ------------[ cut here ]------------
>>> [ 3.993045] adreno 3d00000.gpu: late IOMMU probe at driver bind, something fishy here!
>>> [ 3.994058] WARNING: CPU: 9 PID: 343 at drivers/iommu/iommu.c:579 __iommu_probe_device+0x2b0/0x4ac
>>>
>>> [ 4.003272] CPU: 9 UID: 0 PID: 343 Comm: kworker/u50:2 Not tainted 6.15.0-rc1 #109 PREEMPT
>>> [ 4.003276] Hardware name: LENOVO 21N2ZC5PUS/21N2ZC5PUS, BIOS N42ET83W (2.13 ) 10/04/2024
>>>
>>> [ 4.025943] Call trace:
>>> [ 4.025945] __iommu_probe_device+0x2b0/0x4ac (P)
>>> [ 4.030453] iommu_probe_device+0x38/0x7c
>>> [ 4.030455] of_iommu_configure+0x188/0x26c
>>> [ 4.030457] of_dma_configure_id+0xcc/0x300
>>> [ 4.030460] platform_dma_configure+0x74/0xac
>>> [ 4.030462] really_probe+0x74/0x38c
>>
>> Indeed this is exactly what is *not* supposed to be happening - does
>> this patch help at all?
>>
>> https://lore.kernel.org/linux-iommu/09d901ad11b3a410fbb6e27f7d04ad4609c3fe4a.1741706365.git.robin.murphy@arm.com/
>
> I've only seen that splat once so far so I don't have a reliable
> reproducer.
>
> But AFAICS that patch won't help help here where we appear to have iommu
> probe racing with bus dma_configure() called from really_probe() for the
> client device.
Well, tightening up __iommu_probe_device() would stand to slightly
reduce the window in general while bus_set_iommu() is running. However
you're right that this is a different race from the ones implicated
there. I have now managed to provoke it on my Juno board with
"driver_async_probe=*" (which does also require that patch for other
reasons), and I think I've got a reasonable fix which I shall finish
writing up and send shortly. Thanks for helping me nail this one down!
Cheers,
Robin.
More information about the linux-arm-kernel
mailing list