[PATCH 0/3] improve nvme quiesce time for large amount of namespaces
Sagi Grimberg
sagi at grimberg.me
Sun Jul 31 03:23:36 PDT 2022
>> Why can't we have a per-tagset quiesce flag and just wait for that
>> one? That also nicely handles the problem of changes in the
>> namespace list during that time.
> Because if we quiesce queues based on the tagset, it is difficult to
> distinguish the non-I/O queues. The I/O queues are handled differently
> from the other queues, such as the fabrics_q, admin_q, etc., which may
> cause confusion in the code logic.
It is primarily the connect_q where we issue the I/O queue connect
commands... We should not quiesce the connect_q in nvme_stop_queues(),
as that function relates only to the namespace queues.
In the last attempt to do a tagset flag, we ended up having to do
something like:
--
void nvme_stop_queues(struct nvme_ctrl *ctrl)
{
	blk_mq_quiesce_tagset(ctrl->tagset);
	/* the connect_q must keep serving io queue connect commands */
	if (ctrl->connect_q)
		blk_mq_unquiesce_queue(ctrl->connect_q);
}
EXPORT_SYMBOL_GPL(nvme_stop_queues);
--
But maybe we can avoid that: because we allocate the connect_q
ourselves, and fully know that it should not be a part of the tagset
quiesce, perhaps we can introduce a new interface like:
--
static inline int nvme_ctrl_init_connect_q(struct nvme_ctrl *ctrl)
{
	ctrl->connect_q = blk_mq_init_queue_self_quiesce(ctrl->tagset);
	if (IS_ERR(ctrl->connect_q))
		return PTR_ERR(ctrl->connect_q);
	return 0;
}
--
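For illustration only, a minimal sketch of what that helper might look
like, assuming a new self_quiesce member on struct request_queue (both
the helper and the flag are hypothetical at this point):
--
/*
 * Sketch: same as blk_mq_init_queue(), but mark the queue so that a
 * tagset-wide quiesce skips it. self_quiesce is an assumed new field
 * of struct request_queue.
 */
struct request_queue *blk_mq_init_queue_self_quiesce(struct blk_mq_tag_set *set)
{
	struct request_queue *q = blk_mq_init_queue(set);

	if (!IS_ERR(q))
		q->self_quiesce = true;
	return q;
}
--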
And then blk_mq_quiesce_tagset() can simply look at a per-request-queue
self_quiesce flag and skip those queues as needed.
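Roughly something like the following sketch (again assuming the
hypothetical self_quiesce flag; a real implementation would also need
to handle BLK_MQ_F_BLOCKING tagsets, which require SRCU instead of a
plain RCU grace period):
--
void blk_mq_quiesce_tagset(struct blk_mq_tag_set *set)
{
	struct request_queue *q;

	mutex_lock(&set->tag_list_lock);
	list_for_each_entry(q, &set->tag_list, tag_set_list) {
		/* skip queues that manage their own quiesce state */
		if (q->self_quiesce)
			continue;
		blk_mq_quiesce_queue_nowait(q);
	}
	/* a single grace period covers all quiesced queues at once */
	synchronize_rcu();
	mutex_unlock(&set->tag_list_lock);
}
--
This is also where the speedup for controllers with a large amount of
namespaces would come from: one shared grace period instead of one per
namespace queue.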