[PATCH 0/3] nvme: fix system fault observed while shutting down controller

Nilay Shroff nilay at linux.ibm.com
Sun Oct 27 10:02:03 PDT 2024


Hi,

This patch series addresses the system fault observed while shutting
down a fabric controller. We already fixed it earlier [1]; however, it
was later realized that there is a better way to address it [2].

The first patch in the series reverts the changes implemented in [3] and
[4], so essentially we're making the keep-alive operation asynchronous
again, as it was earlier.
The second patch in the series fixes the kernel crash observed while
shutting down a fabric controller.
The third patch in the series uses the nvme_ctrl_state function for
retrieving the controller state.

The system fault is caused by a keep-alive request sneaking in while a
fabric controller is being shut down. We intermittently encounter the
kernel crash below while running blktests nvme/037:

dmesg output:
------------
run blktests nvme/037 at 2024-10-04 03:59:27
<snip>
nvme nvme1: new ctrl: "blktests-subsystem-5"
nvme nvme1: Failed to configure AEN (cfg 300)
nvme nvme1: Removing ctrl: NQN "blktests-subsystem-5"
nvme nvme1: long keepalive RTT (54760 ms)
nvme nvme1: failed nvme_keep_alive_end_io error=4
BUG: Kernel NULL pointer dereference on read at 0x00000080
Faulting instruction address: 0xc00000000091c9f8
Oops: Kernel access of bad area, sig: 7 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
<snip>
CPU: 28 UID: 0 PID: 338 Comm: kworker/u263:2 Kdump: loaded Not tainted 6.11.0+ #89
Hardware name: IBM,9043-MRX POWER10 (architected) 0x800200 0xf000006 of:IBM,FW1060.00 (NM1060_028) hv:phyp pSeries
Workqueue: nvme-wq nvme_keep_alive_work [nvme_core]
NIP:  c00000000091c9f8 LR: c00000000084150c CTR: 0000000000000004
<snip>
NIP [c00000000091c9f8] sbitmap_any_bit_set+0x68/0xb8
LR [c00000000084150c] blk_mq_do_dispatch_ctx+0xcc/0x280
Call Trace:
    autoremove_wake_function+0x0/0xbc (unreliable)
    __blk_mq_sched_dispatch_requests+0x114/0x24c
    blk_mq_sched_dispatch_requests+0x44/0x84
    blk_mq_run_hw_queue+0x140/0x220
    nvme_keep_alive_work+0xc8/0x19c [nvme_core]
    process_one_work+0x200/0x4e0
    worker_thread+0x340/0x504
    kthread+0x138/0x140
    start_kernel_thread+0x14/0x18

We realized that the above crash is a regression caused by the changes
implemented in commit a54a93d0e359 ("nvme: move stopping keep-alive into
nvme_uninit_ctrl()"). Ideally, we should stop keep-alive at the very
beginning of the controller shutdown code path so that it can't sneak
in or interfere with the shutdown operation. However, that commit
removed the keep-alive stop operation from the beginning of the
controller shutdown code path, which made it possible for keep-alive to
sneak in, interfere with the shutdown operation, and cause the observed
kernel crash. So to fix this crash, we're now adding back the keep-alive
stop operation at the very beginning of the fabric controller shutdown
code path, so that the actual controller shutdown operation only begins
after it's ensured that no keep-alive operation is in flight and that
none can be scheduled in the future. This is fixed in the second patch
of the series.
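
To illustrate the ordering argument above, here is a minimal userspace
sketch. It is not kernel code and not the actual patch: every name in it
(ka_model_ctrl, ka_model_shutdown, and so on) is hypothetical, and the
"fault" is just a counter standing in for the NULL pointer dereference
seen in the trace. It only models why stopping keep-alive before
teardown closes the window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a controller; none of these are kernel APIs. */
struct ka_model_ctrl {
	bool ka_stopped;   /* once set, keep-alive must not be re-armed */
	bool ka_pending;   /* a keep-alive work item is queued */
	bool queues_alive; /* queue resources have not been torn down yet */
	int faults;        /* keep-alives that ran against torn-down queues */
};

static void ka_model_init(struct ka_model_ctrl *c)
{
	assert(c != NULL);
	c->ka_stopped = false;
	c->ka_pending = true; /* a keep-alive is already scheduled */
	c->queues_alive = true;
	c->faults = 0;
}

/* Stand-in for the keep-alive work item running. */
static void ka_model_work(struct ka_model_ctrl *c)
{
	if (!c->ka_pending)
		return;
	c->ka_pending = false;
	if (!c->queues_alive) {
		c->faults++; /* models touching freed queue state */
		return;
	}
	if (!c->ka_stopped)
		c->ka_pending = true; /* re-arm for the next interval */
}

/* Models cancelling queued work and forbidding any re-arm. */
static void ka_model_stop(struct ka_model_ctrl *c)
{
	c->ka_stopped = true;
	c->ka_pending = false;
}

/* Shutdown path; stop_first selects the ordering under test. */
static int ka_model_shutdown(struct ka_model_ctrl *c, bool stop_first)
{
	if (stop_first)
		ka_model_stop(c);
	c->queues_alive = false; /* teardown */
	ka_model_work(c);        /* a late keep-alive attempt, if one is queued */
	if (!stop_first)
		ka_model_stop(c);
	return c->faults;
}
```

With stop_first set, the late keep-alive finds nothing queued and the
model reports no fault; with it clear, the queued keep-alive runs
against torn-down queues, which is the situation the second patch is
meant to rule out.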

The third patch in the series addresses the use of ctrl->lock before
accessing the NVMe controller state in the nvme_keep_alive_end_io
function. With the introduction of the helper nvme_ctrl_state, we no
longer need to first acquire ctrl->lock before accessing the NVMe
controller state. So this patch removes the use of ctrl->lock from the
nvme_keep_alive_end_io function and replaces it with a call to the
helper nvme_ctrl_state.
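
As a rough userspace analogy for the lock-free state read described
above, the sketch below uses a C11 atomic load in place of taking a
lock. It is only an illustration under stated assumptions: the
state_model_* names are hypothetical, the three states are a reduced
stand-in for the real controller states, and the re-arm condition is
assumed for the example rather than copied from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced set of controller states. */
enum state_model_state {
	STATE_MODEL_LIVE,
	STATE_MODEL_CONNECTING,
	STATE_MODEL_DELETING,
};

struct state_model_ctrl {
	_Atomic int state; /* written by the state machine, read anywhere */
};

/* Writer side: publish a new state. */
static void state_model_set(struct state_model_ctrl *c,
			    enum state_model_state s)
{
	assert(c != NULL);
	atomic_store_explicit(&c->state, (int)s, memory_order_relaxed);
}

/* Reader side: one untorn load of the state, no lock required. */
static enum state_model_state state_model_read(struct state_model_ctrl *c)
{
	return (enum state_model_state)
		atomic_load_explicit(&c->state, memory_order_relaxed);
}

/*
 * Stand-in for the decision made in a keep-alive completion handler:
 * re-arm only while the controller is usable (assumed condition).
 */
static bool state_model_should_rearm(struct state_model_ctrl *c)
{
	enum state_model_state s = state_model_read(c);

	return s == STATE_MODEL_LIVE || s == STATE_MODEL_CONNECTING;
}
```

The state is read once into a local variable and both comparisons use
that snapshot, so the decision is consistent even if the state changes
concurrently.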

[1] https://lore.kernel.org/all/ZxFSkNI2p65ucTB5@kbusch-mbp.dhcp.thefacebook.com/
[2] https://lore.kernel.org/all/196f4013-3bbf-43ff-98b4-9cb2a96c20c2@grimberg.me/
[3] https://lore.kernel.org/all/20241016030339.54029-3-nilay@linux.ibm.com/
[4] https://lore.kernel.org/all/20241016030339.54029-4-nilay@linux.ibm.com/

Nilay Shroff (3):
  Revert "nvme: make keep-alive synchronous operation"
  nvme-fabrics: fix kernel crash while shutting down controller
  nvme: use helper nvme_ctrl_state in nvme_keep_alive_end_io function

 drivers/nvme/host/core.c | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

-- 
2.45.2



