nvme_tcp BUG: unable to handle kernel NULL pointer dereference at 0000000000000230
Engel, Amit
Amit.Engel at Dell.com
Wed Jun 9 00:48:59 PDT 2021
Hi Sagi,
Indeed RHEL8.3 does not have the mutex protection on nvme_tcp_stop_queue
However, in our case, based on the below back trace
We don't get to __nvme_tcp_stop_queue from nvme_tcp_stop_queue
We get to it from:
nvme_tcp_reconnect_ctrl_work --> nvme_tcp_setup_ctrl --> nvme_tcp_start_queue --> __nvme_tcp_stop_queue
so I'm not sure how this mutex protection will help in this case
crash> bt -l
PID: 193053 TASK: ffff9491bdad17c0 CPU: 7 COMMAND: "kworker/u193:9"
#0 [ffffb2e9cfdbbb70] machine_kexec at ffffffffb245bf3e
/usr/src/debug/kernel-4.18.0-240.el8/linux-4.18.0-240.el8.x86_64/arch/x86/kernel/machine_kexec_64.c: 389
#1 [ffffb2e9cfdbbbc8] __crash_kexec at ffffffffb256072d
/usr/src/debug/kernel-4.18.0-240.el8/linux-4.18.0-240.el8.x86_64/kernel/kexec_core.c: 956
#2 [ffffb2e9cfdbbc90] crash_kexec at ffffffffb256160d
/usr/src/debug/kernel-4.18.0-240.el8/linux-4.18.0-240.el8.x86_64/./include/linux/compiler.h: 219
#3 [ffffb2e9cfdbbca8] oops_end at ffffffffb2422d4d
/usr/src/debug/kernel-4.18.0-240.el8/linux-4.18.0-240.el8.x86_64/arch/x86/kernel/dumpstack.c: 334
#4 [ffffb2e9cfdbbcc8] no_context at ffffffffb246ba9e
/usr/src/debug/kernel-4.18.0-240.el8/linux-4.18.0-240.el8.x86_64/arch/x86/mm/fault.c: 773
#5 [ffffb2e9cfdbbd20] do_page_fault at ffffffffb246c5c2
/usr/src/debug/kernel-4.18.0-240.el8/linux-4.18.0-240.el8.x86_64/./arch/x86/include/asm/jump_label.h: 38
#6 [ffffb2e9cfdbbd50] page_fault at ffffffffb2e0122e
/usr/src/debug/kernel-4.18.0-240.el8/linux-4.18.0-240.el8.x86_64/arch/x86/entry/entry_64.S: 1183
[exception RIP: _raw_write_lock_bh+23]
RIP: ffffffffb2cd6cc7 RSP: ffffb2e9cfdbbe00 RFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff94b2aefb4000 RCX: 0000000000000003
RDX: 00000000000000ff RSI: 00000000fffffe01 RDI: 0000000000000230
RBP: ffff94923f793f40 R8: ffff9492ff1ea7f8 R9: 0000000000000000
R10: 0000000000000000 R11: ffff9492ff1e8c64 R12: ffff94b2b7210338
R13: 0000000000000000 R14: ffff94b27f7a4100 R15: ffff94b2b72110a0
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
/usr/src/debug/kernel-4.18.0-240.el8/linux-4.18.0-240.el8.x86_64/./arch/x86/include/asm/atomic.h: 194
#7 [ffffb2e9cfdbbe00] __nvme_tcp_stop_queue at ffffffffc02dc0aa [nvme_tcp]
/usr/src/debug/kernel-4.18.0-240.el8/linux-4.18.0-240.el8.x86_64/drivers/nvme/host/tcp.c: 1486
#8 [ffffb2e9cfdbbe18] nvme_tcp_start_queue at ffffffffc02dcd18 [nvme_tcp]
/usr/src/debug/kernel-4.18.0-240.el8/linux-4.18.0-240.el8.x86_64/drivers/nvme/host/tcp.c: 1525
#9 [ffffb2e9cfdbbe38] nvme_tcp_setup_ctrl at ffffffffc02df258 [nvme_tcp]
/usr/src/debug/kernel-4.18.0-240.el8/linux-4.18.0-240.el8.x86_64/drivers/nvme/host/tcp.c: 1814
#10 [ffffb2e9cfdbbe80] nvme_tcp_reconnect_ctrl_work at ffffffffc02df4bf [nvme_tcp]
/usr/src/debug/kernel-4.18.0-240.el8/linux-4.18.0-240.el8.x86_64/drivers/nvme/host/tcp.c: 1962
#11 [ffffb2e9cfdbbe98] process_one_work at ffffffffb24d3477
/usr/src/debug/kernel-4.18.0-240.el8/linux-4.18.0-240.el8.x86_64/./arch/x86/include/asm/jump_label.h: 38
#12 [ffffb2e9cfdbbed8] worker_thread at ffffffffb24d3b40
/usr/src/debug/kernel-4.18.0-240.el8/linux-4.18.0-240.el8.x86_64/./include/linux/compiler.h: 193
#13 [ffffb2e9cfdbbf10] kthread at ffffffffb24d9502
/usr/src/debug/kernel-4.18.0-240.el8/linux-4.18.0-240.el8.x86_64/kernel/kthread.c: 280
#14 [ffffb2e9cfdbbf50] ret_from_fork at ffffffffb2e00255
/usr/src/debug/kernel-4.18.0-240.el8/linux-4.18.0-240.el8.x86_64/arch/x86/entry/entry_64.S: 360
-----Original Message-----
From: Sagi Grimberg <sagi at grimberg.me>
Sent: Wednesday, June 9, 2021 2:39 AM
To: Engel, Amit; linux-nvme at lists.infradead.org
Cc: Anner, Ran; Grupi, Elad
Subject: Re: nvme_tcp BUG: unable to handle kernel NULL pointer dereference at 0000000000000230
[EXTERNAL EMAIL]
> Hi Sagi,
>
> A correction to the below analysis:
> It seems like sock->sk is NULL and not queue->sock
>
> As part of _nvme_tcp_stop_queue
> kernel_sock_shutdown and nvme_tcp_restore_sock_calls are being called:
> kernel_sock_shutdown leads to nvme_tcp_state_change which will trigger err_work (nvme_tcp_error_recovery_work)
>
> As part of nvme_tcp_error_recovery_work, nvme_tcp_free_queue is being called which releases the socket (sock_release)
>
> In our case, based on the below bt:
> nvme_tcp_error_recovery_work is being triggered (and so sock_release) before nvme_tcp_restore_sock_calls , which end up with NULL dereference pointer at 'rwlock_t sk_callback_lock' ?
>
> Can you please review and provide your inputs for this potential race ?
Seems that RH8.3 is missing the mutex protection on nvme_tcp_stop_queue.
I'm assuming it doesn't happen upstream?
More information about the Linux-nvme
mailing list