[PATCH v2 00/10] Refine the locking for dev->iommu_group
Marek Szyprowski
m.szyprowski at samsung.com
Tue Aug 8 06:00:30 PDT 2023
Hi All,
On 08.08.2023 14:32, Marek Szyprowski wrote:
> On 08.08.2023 12:31, Chen-Yu Tsai wrote:
>> On Mon, Aug 7, 2023 at 8:54 PM Joerg Roedel <joro at 8bytes.org> wrote:
>>> On Mon, Jul 31, 2023 at 02:50:23PM -0300, Jason Gunthorpe wrote:
>>>> Jason Gunthorpe (10):
>>>> iommu: Remove useless group refcounting
>>>> iommu: Add a lockdep assertion for remaining dev->iommu_group reads
>>>> iommu: Add generic_single_device_group()
>>>> iommu/sun50i: Convert to generic_single_device_group()
>>>> iommu/sprd: Convert to generic_single_device_group()
>>>> iommu/rockchip: Convert to generic_single_device_group()
>>>> iommu/ipmmu-vmsa: Convert to generic_single_device_group()
>>>> iommu/omap: Convert to generic_single_device_group()
>>>> iommu: Complete the locking for dev->iommu_group
>>>> iommu/intel: Fix missing locking for show_device_domain_translation()
>>>>
>>>> drivers/iommu/intel/debugfs.c | 34 ++++----
>>>> drivers/iommu/iommu.c | 155 +++++++++++++++++++++------------
>>>> drivers/iommu/ipmmu-vmsa.c | 22 ++---
>>>> drivers/iommu/omap-iommu.c | 30 +------
>>>> drivers/iommu/omap-iommu.h | 2 +-
>>>> drivers/iommu/rockchip-iommu.c | 22 +----
>>>> drivers/iommu/sprd-iommu.c | 24 +----
>>>> drivers/iommu/sun50i-iommu.c | 29 ++----
>>>> include/linux/iommu.h | 3 +
>>>> 9 files changed, 138 insertions(+), 183 deletions(-)
>>> Applied, thanks for the nice cleanup!
>> This series seems to cause a hung task during boot on MediaTek
>> platforms.
>> It hangs with next-20230808. Reverting the 10 commits from this series
>> makes the system boot up again.
>
> I confirm that next-20230808 is broken on ARM 32bit based Exynos
> boards too. Boards lock up very early during boot. I will try to
> investigate this soon.
Hmm, this turned out to be Exynos IOMMU specific, but the underlying issue
is probably generic.
The deadlock happens early in __iommu_probe_device() on
device_lock(dev). Here is a stack dump of that call:
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc5-next-20230808-dirty #7013
Hardware name: Samsung Exynos (Flattened Device Tree)
unwind_backtrace from show_stack+0x10/0x14
show_stack from dump_stack_lvl+0x58/0x70
dump_stack_lvl from __iommu_probe_device+0x3d8/0x4ac
__iommu_probe_device from probe_iommu_group+0x8/0x14
probe_iommu_group from bus_for_each_dev+0x60/0xb4
bus_for_each_dev from bus_iommu_probe+0x34/0x118
bus_iommu_probe from iommu_device_register+0x98/0x100
iommu_device_register from exynos_sysmmu_probe+0x238/0x3c0
exynos_sysmmu_probe from platform_probe+0x80/0xc0
platform_probe from really_probe+0x154/0x3d4
really_probe from __driver_probe_device+0xa0/0x1e8
__driver_probe_device from driver_probe_device+0x30/0xd0
driver_probe_device from __device_attach_driver+0xbc/0x11c
__device_attach_driver from bus_for_each_drv+0x74/0xc0
bus_for_each_drv from __device_attach+0xec/0x1b4
__device_attach from bus_probe_device+0x8c/0x90
bus_probe_device from device_add+0x5b8/0x78c
device_add from of_platform_device_create_pdata+0x94/0xcc
of_platform_device_create_pdata from of_platform_bus_create+0x1ac/0x4d8
of_platform_bus_create from of_platform_bus_create+0x214/0x4d8
of_platform_bus_create from of_platform_populate+0x80/0x114
of_platform_populate from of_platform_default_populate_init+0xcc/0xe4
of_platform_default_populate_init from do_one_initcall+0x6c/0x318
do_one_initcall from kernel_init_freeable+0x1c4/0x214
kernel_init_freeable from kernel_init+0x18/0x12c
kernel_init from ret_from_fork+0x14/0x2c
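The problem here is that exynos_sysmmu_probe() is by design called with
device_lock held on the sysmmu device. It then calls
iommu_device_register(), which in turn triggers __iommu_probe_device() on
all platform devices in the system, and the sysmmu device that is still
being probed is one of them. __iommu_probe_device() tries to take
device_lock on that device again, so the probing thread blocks on a lock
it already holds.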
Frankly speaking, I have no idea how to defer the call to
iommu_device_register() to avoid this deadlock. Any ideas?
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland