[PATCH 1/2] nvme-tcp: avoid race between nvme scan and reset
Chaitanya Kulkarni
chaitanyak at nvidia.com
Mon Jun 2 17:48:15 PDT 2025
On 6/1/25 21:35, Shin'ichiro Kawasaki wrote:
> When the nvme scan work and the nvme controller reset work race, the
> WARN below happens:
>
> WARNING: CPU: 1 PID: 69 at block/blk-mq.c:330 blk_mq_unquiesce_queue+0x8f/0xb0
>
> The WARN can be reproduced by repeating the blktests test case nvme/063 a
> few times [1]. It is caused by the scan work allocating a new queue for
> the tag set between the blk_mq_quiesce_tagset() and
> blk_mq_unquiesce_tagset() calls made by the reset work.
>
> Reset work                     Scan work
> ------------------------------------------------------------------------
> nvme_reset_ctrl_work()
>  nvme_tcp_teardown_ctrl()
>   nvme_tcp_teardown_io_queues()
>    nvme_quiesce_io_queues()
>     blk_mq_quiesce_tagset()
>      list_for_each_entry()                                 .. N queues
>       blk_mq_quiesce_queue()
>                                nvme_scan_work()
>                                 nvme_scan_ns_*()
>                                  nvme_scan_ns()
>                                   nvme_alloc_ns()
>                                    blk_mq_alloc_disk()
>                                     __blk_mq_alloc_disk()
>                                      blk_mq_alloc_queue()
>                                       blk_mq_init_allocate_queue()
>                                        blk_mq_add_queue_tag_set()
>                                         list_add_tail()    .. N+1 queues
>  nvme_tcp_setup_ctrl()
>   nvme_start_ctrl()
>    nvme_unquiesce_io_queues()
>     blk_mq_unquiesce_tagset()
>      list_for_each_entry()                                 .. N+1 queues
>       blk_mq_unquiesce_queue()
>        WARN_ON_ONCE(q->quiesce_depth <= 0)
>
> blk_mq_quiesce_queue() is not called for the new queue added by the scan
> work, while blk_mq_unquiesce_queue() is called for it. Hence the WARN.
>
> To suppress the WARN, avoid the race between the reset work and the scan
> work by flushing the scan work at the beginning of the reset work.
>
> Link: https://lore.kernel.org/linux-nvme/6mhxskdlbo6fk6hotsffvwriauurqky33dfb3s44mqtr5dsxmf@gywwmnyh3twm/ [1]
> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki at wdc.com>
Looks good.
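
For anyone following along, the sequence in the diagram can be modeled in
userspace. This is only a rough sketch and not kernel code: the toy_* names,
the fixed-size array standing in for the tag set list, and the warnings
counter are all made up for illustration, and the scan work is reduced to a
single queue addition. It assumes the unquiesce path skips the decrement
when the depth check fails.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_QUEUES 8

/* Hypothetical stand-in for a request queue: only the depth is modeled. */
struct toy_queue {
	int quiesce_depth;
};

/* Hypothetical stand-in for a tag set: an array instead of a list. */
struct toy_tagset {
	struct toy_queue queues[TOY_MAX_QUEUES];
	size_t nr_queues;
	int warnings;	/* how many times the depth check fired */
};

/* Models the scan work adding a queue for a new namespace. */
void toy_add_queue(struct toy_tagset *set)
{
	if (set->nr_queues < TOY_MAX_QUEUES)
		set->queues[set->nr_queues++].quiesce_depth = 0;
}

/* Models quiescing every queue currently in the tag set. */
void toy_quiesce_tagset(struct toy_tagset *set)
{
	for (size_t i = 0; i < set->nr_queues; i++)
		set->queues[i].quiesce_depth++;
}

/* Models unquiescing every queue, counting unbalanced calls. */
void toy_unquiesce_tagset(struct toy_tagset *set)
{
	for (size_t i = 0; i < set->nr_queues; i++) {
		struct toy_queue *q = &set->queues[i];

		if (q->quiesce_depth <= 0) {
			set->warnings++;	/* stands in for the WARN */
			continue;
		}
		q->quiesce_depth--;
	}
}

/*
 * Runs one simulated reset with one pending namespace scan.
 * Returns the number of unbalanced unquiesce calls.
 */
int toy_reset(bool flush_scan_first)
{
	struct toy_tagset set = { .nr_queues = 0, .warnings = 0 };
	bool scan_pending = true;

	/* Two queues exist before the reset starts (N = 2). */
	toy_add_queue(&set);
	toy_add_queue(&set);

	/* Proposed ordering: let the scan finish before quiescing. */
	if (flush_scan_first) {
		toy_add_queue(&set);
		scan_pending = false;
	}

	toy_quiesce_tagset(&set);

	/* Racy ordering: the scan adds a queue inside the window. */
	if (scan_pending)
		toy_add_queue(&set);

	toy_unquiesce_tagset(&set);
	return set.warnings;
}
```

In this model toy_reset(false) returns 1, because the queue added inside the
window is unquiesced without ever having been quiesced, and toy_reset(true)
returns 0, because all three queues see a balanced quiesce/unquiesce pair.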
Reviewed-by: Chaitanya Kulkarni <kch at nvidia.com>
-ck