[PATCH 0/3] improve nvme quiesce time for large amount of namespaces
Sagi Grimberg
sagi at grimberg.me
Sun Jul 31 03:23:36 PDT 2022
>> Why can't we have a per-tagset quiesce flag and just wait for that
>> one? That also nicely handles the problem of changes in the
>> namespace list during that time.
> Because if we quiesce queues based on the tagset, it is difficult to
> distinguish the non-I/O queues. The I/O queues are handled differently
> from the other queues, such as the fabrics_q, admin_q, etc., which may
> cause confusion in the code logic.
It is primarily the connect_q where we issue the I/O queue connect
commands... We should not quiesce the connect_q in nvme_stop_queues(),
as that function relates only to the namespace queues.
In the last attempt to do a tagset flag, we ended up having to do
something like:
--
void nvme_stop_queues(struct nvme_ctrl *ctrl)
{
	blk_mq_quiesce_tagset(ctrl->tagset);
	/* the connect_q must keep serving io queue connect commands */
	if (ctrl->connect_q)
		blk_mq_unquiesce_queue(ctrl->connect_q);
}
EXPORT_SYMBOL_GPL(nvme_stop_queues);
--
But maybe we can avoid that: because we allocate the connect_q
ourselves, and fully know that it should not be a part of the tagset
quiesce, perhaps we can introduce a new interface like:
--
static inline int nvme_ctrl_init_connect_q(struct nvme_ctrl *ctrl)
{
	ctrl->connect_q = blk_mq_init_queue_self_quiesce(ctrl->tagset);
	if (IS_ERR(ctrl->connect_q))
		return PTR_ERR(ctrl->connect_q);
	return 0;
}
--
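For illustration only, a minimal sketch of what that helper might look
like, assuming a new self_quiesce member on struct request_queue (both
the helper and the flag are hypothetical at this point):
--
/*
 * Sketch: same as blk_mq_init_queue(), but mark the queue so that a
 * tagset-wide quiesce skips it. self_quiesce is an assumed new field
 * of struct request_queue.
 */
struct request_queue *blk_mq_init_queue_self_quiesce(struct blk_mq_tag_set *set)
{
	struct request_queue *q = blk_mq_init_queue(set);

	if (!IS_ERR(q))
		q->self_quiesce = true;
	return q;
}
--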
And then blk_mq_quiesce_tagset() can simply look at a per-request-queue
self_quiesce flag and skip those queues as needed.
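Roughly something like the following sketch (again assuming the
hypothetical self_quiesce flag; a real implementation would also need
to handle BLK_MQ_F_BLOCKING tagsets, which require SRCU instead of a
plain RCU grace period):
--
void blk_mq_quiesce_tagset(struct blk_mq_tag_set *set)
{
	struct request_queue *q;

	mutex_lock(&set->tag_list_lock);
	list_for_each_entry(q, &set->tag_list, tag_set_list) {
		/* skip queues that manage their own quiesce state */
		if (q->self_quiesce)
			continue;
		blk_mq_quiesce_queue_nowait(q);
	}
	/* a single grace period covers all quiesced queues at once */
	synchronize_rcu();
	mutex_unlock(&set->tag_list_lock);
}
--
This is also where the speedup for controllers with a large amount of
namespaces would come from: one shared grace period instead of one per
namespace queue.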