[PATCH] nvmet-rdma: Suppress a class of lockdep complaints

Shinichiro Kawasaki shinichiro.kawasaki at wdc.com
Wed May 10 09:09:51 PDT 2023


On May 09, 2023 / 16:24, Sagi Grimberg wrote:
> 
> > > Bart, thank you very much for this immediate action after the
> > > discussion at LSF.
> > > This is encouraging. I applied the patch on top of v6.4-rc1 and ran
> > > the test case with various transports. Unfortunately, I observed
> > > kernel panics with rdma and siw transports [1][2]. Also I observed
> > > another lockdep WARN with tcp transport [3]. It looks that your fix
> > > unveiled more hidden issue/s.
> > 
> > Please use siw instead of rxe when running blktests - there are known
> > issues with the rxe driver.
> > 
> > Please apply these patches on top of kernel v6.3 instead of v6.4-rc1.
> > The hrtimer_interrupt() crash shown below is a v6.4-rc1 regression and
> > does not occur with the v6.3 kernel.
> > 
> > Since my patch is for the RDMA transport, it is not clear to me why a
> > report for the TCP transport is included in a reply to my patch?
> 
> Agree,

Sorry for my misunderstanding. I tested again with siw on kernel v6.3 and
I still see the kernel panic. Here are the kernel messages I got:

[   59.567730][  T935] rdma_rxe: loaded
[   59.614648][  T915] run blktests nvme/003 at 2023-05-10 08:48:26
[   59.714402][  T948] SoftiWARP attached
[   59.801368][  T969] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[   59.810654][  T970] iwpm_register_pid: Unable to send a nlmsg (client = 2)
[   59.813025][  T970] nvmet_rdma: enabling port 0 (10.0.2.15:4420)
[   59.861254][   T61] nvmet: creating discovery controller 1 for subsystem nqn.2014-08.org.nvmexpress..
[   59.869998][  T971] nvme nvme1: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 10.0.2.150
[   69.939259][  T982] nvme nvme1: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"
[   69.963920][    C2] ------------[ cut here ]------------
[   69.964368][    C2] DEBUG_LOCKS_WARN_ON(1)
[   69.964382][    C2] WARNING: CPU: 2 PID: 825 at kernel/locking/lockdep.c:232 __lock_acquire+0x28a4/00
[   69.965436][    C2] Modules linked in: siw rdma_rxe ib_uverbs ip6_udp_tunnel udp_tunnel nvmet_rdma ng
[   69.969849][    C2] CPU: 2 PID: 825 Comm: kworker/2:4 Not tainted 6.3.0+ #5
[   69.970389][    C2] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/4
[   69.971100][    C2] Workqueue: nvmet-wq nvmet_rdma_release_queue_work [nvmet_rdma]
[   69.971671][    C2] RIP: 0010:__lock_acquire+0x28a4/0x5eb0
[   69.972090][    C2] Code: 08 84 d2 0f 85 52 22 00 00 83 3d 42 c2 18 04 00 75 b4 48 c7 c6 20 17 ce 85b
[   69.973525][    C2] RSP: 0018:ffff8883aef09ce8 EFLAGS: 00010092
[   69.973980][    C2] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 1ffff11075de1370
[   69.974565][    C2] RDX: 0000000000010003 RSI: 0000000000000004 RDI: 0000000000000001
[   69.975155][    C2] RBP: ffff8881215bc108 R08: 0000000000000001 R09: ffff8883aef3084b
[   69.975744][    C2] R10: ffffed1075de6109 R11: 0000000000000001 R12: 0000000000000002
[   69.976331][    C2] R13: 0000000000000000 R14: ffffffff8755632c R15: 00000000ffffffff
[   69.976922][    C2] FS:  0000000000000000(0000) GS:ffff8883aef00000(0000) knlGS:0000000000000000
[   69.977576][    C2] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   69.978063][    C2] CR2: 0000563bf892e000 CR3: 000000012eb74000 CR4: 00000000000006e0
[   69.978651][    C2] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   69.979241][    C2] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   69.979830][    C2] Call Trace:
[   69.980074][    C2]  <IRQ>
[   69.980290][    C2]  ? lock_acquire+0x1b7/0x4e0
[   69.980641][    C2]  ? __pfx___lock_acquire+0x10/0x10
[   69.981030][    C2]  ? __pfx_lock_acquire+0x10/0x10
[   69.981402][    C2]  ? update_process_times+0x158/0x1d0
[   69.981804][    C2]  ? __pfx___lock_acquire+0x10/0x10
[   69.982192][    C2]  lock_acquire+0x1a7/0x4e0
[   69.982528][    C2]  ? hrtimer_interrupt+0x100/0x810
[   69.983794][    C2]  ? __pfx_lock_acquire+0x10/0x10
[   69.985043][    C2]  ? hrtimer_interrupt+0x339/0x810
[   69.986262][    C2]  ? kvm_clock_read+0x14/0x30
[   69.987449][    C2]  _raw_spin_lock_irqsave+0x47/0x70
[   69.988673][    C2]  ? hrtimer_interrupt+0x100/0x810
[   69.989922][    C2]  hrtimer_interrupt+0x100/0x810
[   69.991111][    C2]  ? __pfx_sched_clock_cpu+0x10/0x10
[   69.992319][    C2]  __sysvec_apic_timer_interrupt+0x146/0x3f0
[   69.993581][    C2]  sysvec_apic_timer_interrupt+0x8a/0xb0
[   69.994853][    C2]  </IRQ>
[   69.995882][    C2]  <TASK>
[   69.996900][    C2]  asm_sysvec_apic_timer_interrupt+0x16/0x20
[   69.998122][    C2] RIP: 0010:lockdep_unregister_key+0x105/0x250
[   69.999323][    C2] Code: 7c 08 84 d2 0f 85 29 01 00 00 8b 05 75 7e 19 04 85 c0 74 02 0f 0b e8 6a e7f
[   70.002306][    C2] RSP: 0018:ffff88810ad67ca0 EFLAGS: 00000206
[   70.003521][    C2] RAX: 0000000000000002 RBX: ffffffff897afb78 RCX: 0000000000000001
[   70.004921][    C2] RDX: 0000000000000000 RSI: ffffffff85ce0fa0 RDI: ffffffff85fa7100
[   70.006259][    C2] RBP: ffff88812da24ae0 R08: 0000000000000000 R09: ffff8883aef45c2f
[   70.007590][    C2] R10: ffffed1075de8b85 R11: ffff8881215bb280 R12: 0000000000000000
[   70.008924][    C2] R13: 0000000000000246 R14: ffffffff89939550 R15: ffff88812f5c3118
[   70.010209][    C2]  nvmet_rdma_free_queue+0x2e/0x390 [nvmet_rdma]
[   70.011363][    C2]  nvmet_rdma_release_queue_work+0x3e/0x90 [nvmet_rdma]
[   70.012535][    C2]  process_one_work+0x7e4/0x1390
[   70.013568][    C2]  ? __pfx_lock_acquire+0x10/0x10
[   70.014594][    C2]  ? __pfx_process_one_work+0x10/0x10
[   70.015661][    C2]  ? __pfx_do_raw_spin_lock+0x10/0x10
[   70.016745][    C2]  worker_thread+0xf7/0x12b0
[   70.017730][    C2]  ? __kthread_parkme+0xc1/0x1f0
[   70.018687][    C2]  ? __pfx_worker_thread+0x10/0x10
[   70.019645][    C2]  kthread+0x29e/0x340
[   70.020517][    C2]  ? __pfx_kthread+0x10/0x10
[   70.021427][    C2]  ret_from_fork+0x2c/0x50
[   70.022323][    C2]  </TASK>
[   70.023111][    C2] irq event stamp: 3790
[   70.023976][    C2] hardirqs last  enabled at (3789): [<ffffffff85a00e86>] asm_sysvec_apic_timer_int0
[   70.025311][    C2] hardirqs last disabled at (3790): [<ffffffff85907a9a>] sysvec_apic_timer_interru0
[   70.026609][    C2] softirqs last  enabled at (3158): [<ffffffff832483be>] __irq_exit_rcu+0xfe/0x260
[   70.027887][    C2] softirqs last disabled at (3151): [<ffffffff832483be>] __irq_exit_rcu+0xfe/0x260
[   70.029108][    C2] ---[ end trace 0000000000000000 ]---
[   70.030101][    C2] general protection fault, probably for non-canonical address 0xdffffc0000000008:I
[   70.031556][    C2] KASAN: null-ptr-deref in range [0x0000000000000040-0x0000000000000047]
[   70.032794][    C2] CPU: 2 PID: 825 Comm: kworker/2:4 Tainted: G        W          6.3.0+ #5
[   70.034045][    C2] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/4
[   70.035360][    C2] Workqueue: nvmet-wq nvmet_rdma_release_queue_work [nvmet_rdma]
[   70.036559][    C2] RIP: 0010:__lock_acquire+0x2481/0x5eb0
[   70.037617][    C2] Code: 0f 83 a7 03 00 00 48 8d 1c 5b 48 c1 e3 06 48 81 c3 e0 3f 7b 89 48 b8 00 00f
[   70.040455][    C2] RSP: 0018:ffff8883aef09ce8 EFLAGS: 00010002
[   70.041629][    C2] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 1ffff11075de1370
[   70.042955][    C2] RDX: 0000000000000008 RSI: 0000000000000004 RDI: 0000000000000040
[   70.044233][    C2] RBP: ffff8881215bc108 R08: 0000000000000001 R09: ffff8883aef3084b
[   70.045519][    C2] R10: ffffed1075de6109 R11: ffff8881215bb280 R12: 0000000000000002
[   70.046855][    C2] R13: 0000000000000000 R14: ffffffff8755632c R15: 00000000ffffffff
[   70.048148][    C2] FS:  0000000000000000(0000) GS:ffff8883aef00000(0000) knlGS:0000000000000000
[   70.049519][    C2] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   70.050776][    C2] CR2: 0000563bf892e000 CR3: 000000012eb74000 CR4: 00000000000006e0
[   70.052120][    C2] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   70.053428][    C2] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   70.054774][    C2] Call Trace:
[   70.055795][    C2]  <IRQ>
[   70.056779][    C2]  ? lock_acquire+0x1b7/0x4e0
[   70.057890][    C2]  ? __pfx___lock_acquire+0x10/0x10
[   70.058980][    C2]  ? __pfx_lock_acquire+0x10/0x10
[   70.060031][    C2]  ? update_process_times+0x158/0x1d0
[   70.061089][    C2]  ? __pfx___lock_acquire+0x10/0x10
[   70.062128][    C2]  lock_acquire+0x1a7/0x4e0
[   70.063114][    C2]  ? hrtimer_interrupt+0x100/0x810
[   70.064133][    C2]  ? __pfx_lock_acquire+0x10/0x10
[   70.065138][    C2]  ? hrtimer_interrupt+0x339/0x810
[   70.066153][    C2]  ? kvm_clock_read+0x14/0x30
[   70.067127][    C2]  _raw_spin_lock_irqsave+0x47/0x70
[   70.068136][    C2]  ? hrtimer_interrupt+0x100/0x810
[   70.069135][    C2]  hrtimer_interrupt+0x100/0x810
[   70.070114][    C2]  ? __pfx_sched_clock_cpu+0x10/0x10
[   70.071129][    C2]  __sysvec_apic_timer_interrupt+0x146/0x3f0
[   70.072201][    C2]  sysvec_apic_timer_interrupt+0x8a/0xb0
[   70.073246][    C2]  </IRQ>
[   70.074097][    C2]  <TASK>
[   70.074941][    C2]  asm_sysvec_apic_timer_interrupt+0x16/0x20
[   70.076012][    C2] RIP: 0010:lockdep_unregister_key+0x105/0x250
[   70.077089][    C2] Code: 7c 08 84 d2 0f 85 29 01 00 00 8b 05 75 7e 19 04 85 c0 74 02 0f 0b e8 6a e7f
[   70.079842][    C2] RSP: 0018:ffff88810ad67ca0 EFLAGS: 00000206
[   70.080983][    C2] RAX: 0000000000000002 RBX: ffffffff897afb78 RCX: 0000000000000001
[   70.082242][    C2] RDX: 0000000000000000 RSI: ffffffff85ce0fa0 RDI: ffffffff85fa7100
[   70.083498][    C2] RBP: ffff88812da24ae0 R08: 0000000000000000 R09: ffff8883aef45c2f
[   70.084767][    C2] R10: ffffed1075de8b85 R11: ffff8881215bb280 R12: 0000000000000000
[   70.086040][    C2] R13: 0000000000000246 R14: ffffffff89939550 R15: ffff88812f5c3118
[   70.087312][    C2]  nvmet_rdma_free_queue+0x2e/0x390 [nvmet_rdma]
[   70.088457][    C2]  nvmet_rdma_release_queue_work+0x3e/0x90 [nvmet_rdma]
[   70.089629][    C2]  process_one_work+0x7e4/0x1390
[   70.090651][    C2]  ? __pfx_lock_acquire+0x10/0x10
[   70.091665][    C2]  ? __pfx_process_one_work+0x10/0x10
[   70.092698][    C2]  ? __pfx_do_raw_spin_lock+0x10/0x10
[   70.093717][    C2]  worker_thread+0xf7/0x12b0
[   70.094656][    C2]  ? __kthread_parkme+0xc1/0x1f0
[   70.095592][    C2]  ? __pfx_worker_thread+0x10/0x10
[   70.096533][    C2]  kthread+0x29e/0x340
[   70.097392][    C2]  ? __pfx_kthread+0x10/0x10
[   70.098286][    C2]  ret_from_fork+0x2c/0x50
[   70.099170][    C2]  </TASK>
[   70.099941][    C2] Modules linked in: siw rdma_rxe ib_uverbs ip6_udp_tunnel udp_tunnel nvmet_rdma ng
[   70.107198][    C2] ---[ end trace 0000000000000000 ]---
[   70.108252][    C2] RIP: 0010:__lock_acquire+0x2481/0x5eb0
[   70.109319][    C2] Code: 0f 83 a7 03 00 00 48 8d 1c 5b 48 c1 e3 06 48 81 c3 e0 3f 7b 89 48 b8 00 00f
[   70.112125][    C2] RSP: 0018:ffff8883aef09ce8 EFLAGS: 00010002
[   70.113284][    C2] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 1ffff11075de1370
[   70.114588][    C2] RDX: 0000000000000008 RSI: 0000000000000004 RDI: 0000000000000040
[   70.115899][    C2] RBP: ffff8881215bc108 R08: 0000000000000001 R09: ffff8883aef3084b
[   70.117205][    C2] R10: ffffed1075de6109 R11: ffff8881215bb280 R12: 0000000000000002
[   70.118517][    C2] R13: 0000000000000000 R14: ffffffff8755632c R15: 00000000ffffffff
[   70.119840][    C2] FS:  0000000000000000(0000) GS:ffff8883aef00000(0000) knlGS:0000000000000000
[   70.121226][    C2] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   70.122449][    C2] CR2: 0000563bf892e000 CR3: 000000012eb74000 CR4: 00000000000006e0
[   70.123791][    C2] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   70.125131][    C2] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   70.126462][    C2] Kernel panic - not syncing: Fatal exception in interrupt
[   70.127873][    C2] Kernel Offset: 0x2000000 from 0xffffffff81000000 (relocation range: 0xffffffff80)
[   70.129488][    C2] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---


