[PATCH v2 00/10] Refine the locking for dev->iommu_group

Chen-Yu Tsai wenst at chromium.org
Tue Aug 8 03:31:47 PDT 2023


Hi,

On Mon, Aug 7, 2023 at 8:54 PM Joerg Roedel <joro at 8bytes.org> wrote:
>
> On Mon, Jul 31, 2023 at 02:50:23PM -0300, Jason Gunthorpe wrote:
> > Jason Gunthorpe (10):
> >   iommu: Remove useless group refcounting
> >   iommu: Add a lockdep assertion for remaining dev->iommu_group reads
> >   iommu: Add generic_single_device_group()
> >   iommu/sun50i: Convert to generic_single_device_group()
> >   iommu/sprd: Convert to generic_single_device_group()
> >   iommu/rockchip: Convert to generic_single_device_group()
> >   iommu/ipmmu-vmsa: Convert to generic_single_device_group()
> >   iommu/omap: Convert to generic_single_device_group()
> >   iommu: Complete the locking for dev->iommu_group
> >   iommu/intel: Fix missing locking for show_device_domain_translation()
> >
> >  drivers/iommu/intel/debugfs.c  |  34 ++++----
> >  drivers/iommu/iommu.c          | 155 +++++++++++++++++++++------------
> >  drivers/iommu/ipmmu-vmsa.c     |  22 ++---
> >  drivers/iommu/omap-iommu.c     |  30 +------
> >  drivers/iommu/omap-iommu.h     |   2 +-
> >  drivers/iommu/rockchip-iommu.c |  22 +----
> >  drivers/iommu/sprd-iommu.c     |  24 +----
> >  drivers/iommu/sun50i-iommu.c   |  29 ++----
> >  include/linux/iommu.h          |   3 +
> >  9 files changed, 138 insertions(+), 183 deletions(-)
>
> Applied, thanks for the nice cleanup!

This series seems to cause a hung task during boot on MediaTek platforms.
I see the hang with next-20230808. Reverting the 10 commits from this
series makes the system boot again.

ChenYu

Logs follow.

INFO: task swapper/0:1 blocked for more than 122 seconds.
      Not tainted 6.5.0-rc5-next-20230808-08004-g396bbe23dbf4 #859
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:swapper/0       state:D stack:0     pid:1     ppid:0      flags:0x00000008
Call trace:
 __switch_to+0x138/0x1e8
 __schedule+0x728/0x1388
 schedule+0xa8/0x170
 schedule_timeout+0x19c/0x1b8
 __wait_for_common+0x250/0x2c0
 wait_for_completion+0x28/0x40
 __flush_work+0x37c/0x6c0
 flush_work+0x1c/0x30
 deferred_probe_initcall+0x60/0xd0
 do_one_initcall+0xe0/0x4a0
 kernel_init_freeable+0x3a4/0x730
 kernel_init+0x2c/0x1f8
 ret_from_fork+0x10/0x20
INFO: task kworker/u18:1:67 blocked for more than 122 seconds.
      Not tainted 6.5.0-rc5-next-20230808-08004-g396bbe23dbf4 #859
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u18:1   state:D stack:0     pid:67    ppid:2      flags:0x00000008
Workqueue: events_unbound deferred_probe_work_func
Call trace:
 __switch_to+0x138/0x1e8
 __schedule+0x728/0x1388
 schedule+0xa8/0x170
 schedule_preempt_disabled+0x44/0x80
 __mutex_lock+0x3fc/0x598
 mutex_lock_nested+0x2c/0x40
 __iommu_probe_device+0xb8/0x6e0
 probe_iommu_group+0x18/0x38
 bus_for_each_dev+0xe4/0x168
 bus_iommu_probe+0x8c/0x240
 iommu_device_register+0x120/0x1b0
 mtk_iommu_probe+0x494/0x7a0
 platform_probe+0x94/0x100
 really_probe+0x1e4/0x3e8
 __driver_probe_device+0xc0/0x1a0
 driver_probe_device+0x110/0x1f0
 __device_attach_driver+0xf0/0x1b0
 bus_for_each_drv+0xf0/0x170
 __device_attach+0x120/0x240
 device_initial_probe+0x1c/0x30
 bus_probe_device+0xdc/0xe8
 deferred_probe_work_func+0xf0/0x140
 process_one_work+0x3b0/0x910
 worker_thread+0x33c/0x610
 kthread+0x1dc/0x1f0
 ret_from_fork+0x10/0x20

Showing all locks held in the system:
4 locks held by kworker/u18:1/67:
 #0: ffffff80c001b538 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x2c4/0x910
 #1: ffffffc080777d40 (deferred_probe_work){+.+.}-{0:0}, at: process_one_work+0x2c4/0x910
 #2: ffffff80c14090f8 (&dev->mutex){....}-{3:3}, at: __device_attach+0x8c/0x240
 #3: ffffff80c14090f8 (&dev->mutex){....}-{3:3}, at: __iommu_probe_device+0xb8/0x6e0
1 lock held by khungtaskd/70:
 #0: ffffffe379f945a0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x24/0x220

=============================================

Kernel panic - not syncing: hung_task: blocked tasks
CPU: 4 PID: 70 Comm: khungtaskd Not tainted 6.5.0-rc5-next-20230808-08004-g396bbe23dbf4 #859 55426859c267064a381312eb869e94c28566a87f
Hardware name: Google juniper sku16 board (DT)
Call trace:
 dump_backtrace+0xa0/0x100
 show_stack+0x20/0x38
 dump_stack_lvl+0xdc/0x148
 dump_stack+0x1c/0x28
 panic+0x460/0x4d8
 watchdog+0x4a4/0x9d0
 kthread+0x1dc/0x1f0
 ret_from_fork+0x10/0x20
SMP: stopping secondary CPUs
Kernel Offset: 0x22f7000000 from 0xffffffc080000000
PHYS_OFFSET: 0x40000000
CPU features: 0x0000000c,92010000,0800421b
Memory Limit: none
Rebooting in 30 seconds..


