[bug report] nvme/063 failure (tcp transport)
Shinichiro Kawasaki
shinichiro.kawasaki at wdc.com
Wed May 21 04:51:42 PDT 2025
On May 20, 2025 / 13:45, Hannes Reinecke wrote:
...
> Hmm.
> Can you check with this patch (on top of the previous one):
>
>
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 55569eb7770b..43d86e8c6df3 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -2197,6 +2197,7 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
>  		blk_mq_update_nr_hw_queues(ctrl->tagset,
>  				ctrl->queue_count - 1);
>  		nvme_unfreeze(ctrl);
> +		nvme_quiesce_io_queues(ctrl);
>  	}
> 
>  	/*
I applied the patch above on top of the previous patch, but the WARN and the
KASAN splat were still observed...

Still, the patch gave me a hint that nvme_quiesce_io_queues(ctrl) could be
related. Taking a closer look based on that, I found blk_mq_unquiesce_tagset()
in the WARN call trace. I created a debug patch [2] that counts how many queues
are allocated for the target tag set in blk_mq_quiesce_tagset() and
blk_mq_unquiesce_tagset(). The patch also logs when blk_mq_alloc_disk() is
called. With this debug patch applied, I captured the dmesg output [3].

I interpret the dmesg as follows: the number of queues allocated for the tag
set increases between the blk_mq_quiesce_tagset() and blk_mq_unquiesce_tagset()
calls. Hence the WARN.
...
[ 44.459818] [ T1163] nvme nvme1: resetting controller
[ 44.460220] [ T13] blk_mq_quiesce_tagset: queues=1 -> The tag set has one queue, and it is the quiesce target.
...
[ 44.472749] [ T12] nvme_alloc_ns: calling blk_mq_alloc_disk() -> A new queue is allocated for the tag set.
...
[ 44.500687] [ T13] WARNING: CPU: 2 PID: 13 at block/blk-mq.c:330 blk_mq_unquiesce_queue+0x8f/0xb0
...
[ 44.539174] [ T13] ---[ end trace 0000000000000000 ]---
[ 44.539967] [ T13] blk_mq_unquiesce_tagset: queues=2 -> Two queues of the tag set are the unquiesce target. One of them was added after the quiesce, which causes the WARN.
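
For context, the WARN at blk-mq.c:330 is the underflow check on the per-queue
quiesce depth counter. Below is a condensed paraphrase of the two helpers in
block/blk-mq.c (as of v6.15-rc, trimmed by me, so not the verbatim upstream
code):

void blk_mq_quiesce_queue_nowait(struct request_queue *q)
{
	unsigned long flags;

	spin_lock_irqsave(&q->queue_lock, flags);
	if (!q->quiesce_depth++)
		blk_queue_flag_set(QUEUE_FLAG_QUIESCED, q);
	spin_unlock_irqrestore(&q->queue_lock, flags);
}

void blk_mq_unquiesce_queue(struct request_queue *q)
{
	unsigned long flags;
	bool run_queue = false;

	spin_lock_irqsave(&q->queue_lock, flags);
	if (WARN_ON_ONCE(q->quiesce_depth <= 0)) {	/* blk-mq.c:330 */
		; /* unbalanced unquiesce: the WARN in the trace above */
	} else if (!--q->quiesce_depth) {
		blk_queue_flag_clear(QUEUE_FLAG_QUIESCED, q);
		run_queue = true;
	}
	spin_unlock_irqrestore(&q->queue_lock, flags);

	/* dispatch requests which were inserted during quiescing */
	if (run_queue)
		blk_mq_run_hw_queues(q, true);
}

A queue that joins the tag set after blk_mq_quiesce_tagset() still has
quiesce_depth == 0, so the unquiesce walk trips this WARN_ON_ONCE for it.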
IIUC, this happens because of a race between nvme scan and nvme reset. The test
case creates and connects to a subsystem, then does nvme reset right after. The
subsystem creation triggers an nvme scan, which adds a queue to the tag set. In
parallel, the nvme reset work calls blk_mq_quiesce_tagset() and
blk_mq_unquiesce_tagset().
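
To illustrate the suspected interleaving: blk_mq_alloc_disk() adds the new
queue to set->tag_list via blk_mq_add_queue_tag_set() (again a condensed
paraphrase of block/blk-mq.c, with the shared-tag bookkeeping trimmed), and
nothing orders that insertion against the quiesce/unquiesce pair, which each
take tag_list_lock only for their own list walk:

static void blk_mq_add_queue_tag_set(struct blk_mq_tag_set *set,
		struct request_queue *q)
{
	mutex_lock(&set->tag_list_lock);
	/* ... shared-tag flag updates trimmed ... */
	list_add_tail(&q->tag_set_list, &set->tag_list);
	mutex_unlock(&set->tag_list_lock);
}

/*
 * Suspected interleaving, matching the dmesg above (my reading, not
 * verified step by step):
 *
 *   reset work (T13)                   scan work (T12)
 *   ----------------                   ---------------
 *   blk_mq_quiesce_tagset()
 *     walks 1 queue                    nvme_alloc_ns()
 *                                        blk_mq_alloc_disk()
 *                                          blk_mq_add_queue_tag_set()
 *   blk_mq_unquiesce_tagset()
 *     walks 2 queues -> WARN on the new queue (quiesce_depth == 0)
 */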
Based on this guess, I created a trial fix patch below. To avoid the race, it
flushes the nvme scan work before starting the nvme reset work. With this
patch, the WARN disappeared (I repeated the test case 20 times to confirm it).
Comments on this fix approach would be appreciated.

Even with the trial fix patch, the KASAN splat is still observed. I guess it
has another cause and requires more debug work.
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index aba365f97cf6..d89c89570d11 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2533,6 +2533,9 @@ static void nvme_reset_ctrl_work(struct work_struct *work)
 		container_of(work, struct nvme_ctrl, reset_work);
 	int ret;
 
+	/* prevent racing with ns scanning */
+	flush_work(&ctrl->scan_work);
+
 	if (nvme_tcp_key_revoke_needed(ctrl))
 		nvme_auth_revoke_tls_key(ctrl);
 	nvme_stop_ctrl(ctrl);
[2] debug patch
diff --git a/block/blk-mq.c b/block/blk-mq.c
index c2697db59109..d8d09eb9d2e0 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -344,12 +344,15 @@ EXPORT_SYMBOL_GPL(blk_mq_unquiesce_queue);
 void blk_mq_quiesce_tagset(struct blk_mq_tag_set *set)
 {
 	struct request_queue *q;
+	int count = 0;
 
 	mutex_lock(&set->tag_list_lock);
 	list_for_each_entry(q, &set->tag_list, tag_set_list) {
+		count++;
 		if (!blk_queue_skip_tagset_quiesce(q))
 			blk_mq_quiesce_queue_nowait(q);
 	}
+	printk("%s: queues=%d\n", __func__, count);
 	mutex_unlock(&set->tag_list_lock);
 
 	blk_mq_wait_quiesce_done(set);
@@ -359,12 +362,15 @@ EXPORT_SYMBOL_GPL(blk_mq_quiesce_tagset);
 void blk_mq_unquiesce_tagset(struct blk_mq_tag_set *set)
 {
 	struct request_queue *q;
+	int count = 0;
 
 	mutex_lock(&set->tag_list_lock);
 	list_for_each_entry(q, &set->tag_list, tag_set_list) {
+		count++;
 		if (!blk_queue_skip_tagset_quiesce(q))
 			blk_mq_unquiesce_queue(q);
 	}
+	printk("%s: queues=%d\n", __func__, count);
 	mutex_unlock(&set->tag_list_lock);
 }
 EXPORT_SYMBOL_GPL(blk_mq_unquiesce_tagset);
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 6b04473c0ab7..2cf6ffb6e178 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3917,6 +3917,7 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, struct nvme_ns_info *info)
 	    ctrl->ops->supports_pci_p2pdma(ctrl))
 		lim.features |= BLK_FEAT_PCI_P2PDMA;
 
+	printk("%s: calling blk_mq_alloc_disk()\n", __func__);
 	disk = blk_mq_alloc_disk(ctrl->tagset, &lim, ns);
 	if (IS_ERR(disk))
 		goto out_free_ns;
[3] dmesg output
[ 44.040470] [ T1002] run blktests nvme/063 at 2025-05-21 19:44:31
[ 44.146521] [ T1111] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[ 44.155990] [ T1112] nvmet: Allow non-TLS connections while TLS1.3 is enabled
[ 44.162932] [ T1115] nvmet_tcp: enabling port 0 (127.0.0.1:4420)
[ 44.266116] [ T1126] nvme nvme1: failed to connect socket: -512
[ 44.273193] [ T51] nvmet_tcp: failed to allocate queue, error -107
[ 44.278519] [ T181] nvmet: Created nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 with DH-HMAC-CHAP.
[ 44.289935] [ T51] nvme nvme1: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
[ 44.290873] [ T1126] nvme nvme1: qid 0: authenticated
[ 44.384515] [ T68] nvmet: Created nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349, TLS.
[ 44.386596] [ T1126] nvme nvme1: creating 4 I/O queues.
[ 44.404887] [ T1126] nvme nvme1: mapped 4/0/0 default/read/poll queues.
[ 44.407213] [ T1126] nvme nvme1: new ctrl: NQN "blktests-subsystem-1", addr 127.0.0.1:4420, hostnqn: nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
[ 44.459818] [ T1163] nvme nvme1: resetting controller
[ 44.460220] [ T13] blk_mq_quiesce_tagset: queues=1
[ 44.464021] [ T181] nvmet: Created nvm controller 2 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 with DH-HMAC-CHAP.
[ 44.470700] [ T12] nvme nvme1: qid 0: authenticated with hash hmac(sha256) dhgroup ffdhe2048
[ 44.471345] [ T13] nvme nvme1: qid 0: authenticated
[ 44.472749] [ T12] nvme_alloc_ns: calling blk_mq_alloc_disk()
[ 44.477396] [ T221] nvmet: Created nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349, TLS.
[ 44.481052] [ T13] nvme nvme1: creating 4 I/O queues.
[ 44.500307] [ T13] ------------[ cut here ]------------
[ 44.500687] [ T13] WARNING: CPU: 2 PID: 13 at block/blk-mq.c:330 blk_mq_unquiesce_queue+0x8f/0xb0
[ 44.501065] [ T13] Modules linked in: tls nvmet_tcp nvmet nvme_tcp nvme_fabrics nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables qrtr sunrpc ppdev 9pnet_virtio 9pnet netfs i2c_piix4 e1000 pcspkr i2c_smbus parport_pc parport fuse loop dm_multipath nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci zram bochs drm_client_lib drm_shmem_helper drm_kms_helper xfs nvme drm nvme_core sym53c8xx floppy nvme_keyring nvme_auth scsi_transport_spi serio_raw ata_generic pata_acpi qemu_fw_cfg
[ 44.503782] [ T13] CPU: 2 UID: 0 PID: 13 Comm: kworker/u16:1 Not tainted 6.15.0-rc7+ #39 PREEMPT(voluntary)
[ 44.504223] [ T13] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014
[ 44.504673] [ T13] Workqueue: nvme-reset-wq nvme_reset_ctrl_work [nvme_tcp]
[ 44.504990] [ T13] RIP: 0010:blk_mq_unquiesce_queue+0x8f/0xb0
[ 44.505239] [ T13] Code: 01 48 89 de bf 09 00 00 00 e8 dd 92 fc ff 48 89 ee 4c 89 e7 e8 b2 df 81 01 48 89 df be 01 00 00 00 5b 5d 41 5c e9 b1 fb ff ff <0f> 0b 5b 48 89 ee 4c 89 e7 5d 41 5c e9 90 df 81 01 e8 0b 1e 83 ff
[ 44.506202] [ T13] RSP: 0018:ffff8881008efb50 EFLAGS: 00010046
[ 44.506530] [ T13] RAX: 0000000000000000 RBX: ffff8881248209c0 RCX: ffffffff916b96e9
[ 44.506913] [ T13] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff888124820b10
[ 44.507312] [ T13] RBP: 0000000000000246 R08: 0000000000000001 R09: ffffed102011df58
[ 44.507763] [ T13] R10: 0000000000000003 R11: ffffffff912f666a R12: ffff888124820ad0
[ 44.508843] [ T13] R13: ffff88810f138078 R14: ffff88810f138108 R15: ffff88810f138450
[ 44.509918] [ T13] FS: 0000000000000000(0000) GS:ffff88841753d000(0000) knlGS:0000000000000000
[ 44.511015] [ T13] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 44.511990] [ T13] CR2: 00007f957426bd70 CR3: 0000000136504000 CR4: 00000000000006f0
[ 44.513022] [ T13] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 44.514086] [ T13] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[ 44.515190] [ T13] Call Trace:
[ 44.516014] [ T13] <TASK>
[ 44.516824] [ T13] blk_mq_unquiesce_tagset+0xbc/0xe0
[ 44.517689] [ T13] nvme_tcp_setup_ctrl.cold+0x6f2/0xc89 [nvme_tcp]
[ 44.518633] [ T13] ? __pfx_nvme_tcp_setup_ctrl+0x10/0x10 [nvme_tcp]
[ 44.519564] [ T13] ? _raw_spin_unlock_irqrestore+0x35/0x60
[ 44.520454] [ T13] ? nvme_change_ctrl_state+0x1a1/0x2e0 [nvme_core]
[ 44.521408] [ T13] nvme_reset_ctrl_work+0x1a1/0x250 [nvme_tcp]
[ 44.522289] [ T13] process_one_work+0x84f/0x1460
[ 44.523157] [ T13] ? __pfx_process_one_work+0x10/0x10
[ 44.524012] [ T13] ? assign_work+0x16c/0x240
[ 44.524854] [ T13] worker_thread+0x5ef/0xfd0
[ 44.525660] [ T13] ? __kthread_parkme+0xb4/0x200
[ 44.526508] [ T13] ? __pfx_worker_thread+0x10/0x10
[ 44.527315] [ T13] kthread+0x3b0/0x770
[ 44.528110] [ T13] ? __pfx_kthread+0x10/0x10
[ 44.528882] [ T13] ? ret_from_fork+0x17/0x70
[ 44.529696] [ T13] ? ret_from_fork+0x17/0x70
[ 44.530451] [ T13] ? _raw_spin_unlock_irq+0x24/0x50
[ 44.531232] [ T13] ? __pfx_kthread+0x10/0x10
[ 44.531982] [ T13] ret_from_fork+0x30/0x70
[ 44.532734] [ T13] ? __pfx_kthread+0x10/0x10
[ 44.533464] [ T13] ret_from_fork_asm+0x1a/0x30
[ 44.534217] [ T13] </TASK>
[ 44.534848] [ T13] irq event stamp: 520990
[ 44.535526] [ T13] hardirqs last enabled at (520989): [<ffffffff93e98974>] _raw_spin_unlock_irq+0x24/0x50
[ 44.536455] [ T13] hardirqs last disabled at (520990): [<ffffffff93e79f1d>] __schedule+0x2fad/0x5fa0
[ 44.537356] [ T13] softirqs last enabled at (520682): [<ffffffff91518119>] __irq_exit_rcu+0x109/0x210
[ 44.538264] [ T13] softirqs last disabled at (520437): [<ffffffff91518119>] __irq_exit_rcu+0x109/0x210
[ 44.539174] [ T13] ---[ end trace 0000000000000000 ]---
[ 44.539967] [ T13] blk_mq_unquiesce_tagset: queues=2
[ 44.546746] [ T13] nvme nvme1: mapped 4/0/0 default/read/poll queues.
[ 44.567319] [ T1182] nvme nvme1: Removing ctrl: NQN "blktests-subsystem-1"
[ 44.569415] [ T1026] block nvme1n1: no available path - failing I/O
[ 44.570557] [ T1026] block nvme1n1: no available path - failing I/O
[ 44.571222] [ T1026] Buffer I/O error on dev nvme1n1, logical block 262030, async page read
[ 44.617517] [ T1182] blk_mq_quiesce_tagset: queues=1
[ 44.620374] [ T1182] blk_mq_unquiesce_tagset: queues=1
[ 44.887578] [ T180] nvmet: Created nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 with DH-HMAC-CHAP.
[ 44.904335] [ T48] nvme nvme1: qid 0: authenticated with hash hmac(sha384) dhgroup ffdhe3072
[ 44.905297] [ T1195] nvme nvme1: qid 0: authenticated
[ 44.908055] [ T1195] nvme nvme1: failed to connect socket: -512
[ 44.915363] [ T51] nvmet_tcp: failed to allocate queue, error -107
[ 44.920798] [ T180] nvmet: Created nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349 with DH-HMAC-CHAP.
[ 44.935868] [ T51] nvme nvme1: qid 0: authenticated with hash hmac(sha384) dhgroup ffdhe3072
[ 44.936914] [ T1195] nvme nvme1: qid 0: authenticated
[ 44.977465] [ T180] nvmet: Created nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349, TLS.
[ 44.980717] [ T1195] nvme nvme1: creating 4 I/O queues.
[ 44.998589] [ T1195] nvme nvme1: mapped 4/0/0 default/read/poll queues.
[ 45.001977] [ T1195] nvme nvme1: new ctrl: NQN "blktests-subsystem-1", addr 127.0.0.1:4420, hostnqn: nqn.2014-08.org.nvmexpress:uuid:0f01fb42-9f7f-4856-b0b3-51e60b8de349
[ 45.083722] [ T1234] nvme nvme1: Removing ctrl: NQN "blktests-subsystem-1"
[ 47.457819] [ T13] nvme_alloc_ns: calling blk_mq_alloc_disk()
[ 47.478144] [ T1234] blk_mq_quiesce_tagset: queues=1
[ 47.483764] [ T1234] blk_mq_unquiesce_tagset: queues=1