[PATCH 0/3] improve nvme quiesce time for large amount of namespaces

Chao Leng lengchao at huawei.com
Sun Jul 31 18:45:57 PDT 2022



On 2022/7/31 18:23, Sagi Grimberg wrote:
> 
>>> Why can't we have a per-tagset quiesce flag and just wait for the
>>> one?  That also really nicely supports the problem with changes in
>>> the namespace list during that time.
>> Because if we quiesce queues based on tagset, it is difficult to
>> distinguish non-I/O queues. I/O queues are processed differently
>> from other queues such as fabrics_q, admin_q, etc., which may cause
>> confusion in the code logic.
> 
> It is primarily the connect_q where we issue io queue connect...
> We should not quiesce the connect_q in nvme_stop_queues() as that
> relates to only namespaces queues.
We could special-case connect_q, fabrics_q and admin_q, but that would
implement redundant semantics in nvme_xxx_teardown_io_queues, and those
actions would be confusing with respect to
nvme_xxx_teardown_admin_queue. It does not look clear.
Therefore, I think quiescing queues based on namespaces is the better
option. In addition, I do not see the benefit of quiescing queues based
on tagset.
> 
> In the last attempt to do a tagset flag, we ended up having to do
> something like:
> -- 
> void nvme_stop_queues(struct nvme_ctrl *ctrl)
> {
>      blk_mq_quiesce_tagset(ctrl->tagset);
>      if (ctrl->connect_q)
>          blk_mq_unquiesce_queue(ctrl->connect_q);
> }
> EXPORT_SYMBOL_GPL(nvme_stop_queues);
> -- 
> 
> But maybe we can avoid that, and because we allocate
> the connect_q ourselves, and fully know that it should
> not be a part of the tagset quiesce, perhaps we can introduce
> a new interface like:
> -- 
> static inline int nvme_ctrl_init_connect_q(struct nvme_ctrl *ctrl)
> {
>      ctrl->connect_q = blk_mq_init_queue_self_quiesce(ctrl->tagset);
>      if (IS_ERR(ctrl->connect_q))
>          return PTR_ERR(ctrl->connect_q);
>      return 0;
> }
> -- 
> 
> And then blk_mq_quiesce_tagset can simply look into a per request-queue
> self_quiesce flag and skip as needed.
> .
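
For reference, the skip-flag idea quoted above can be modelled in
userspace roughly as follows. This is only a minimal sketch, not kernel
code: every model_* name is hypothetical, the fixed-size array stands in
for the tagset's list of request queues, and the wait for in-flight
dispatch to finish (which the real quiesce path needs) is left out.

```c
/* Userspace model only, NOT kernel code. All model_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_QUEUES 8

struct model_queue {
	bool self_quiesce;	/* owner quiesces this queue itself (e.g. connect_q) */
	int quiesce_depth;	/* > 0 means the queue is quiesced */
};

struct model_tagset {
	struct model_queue *queues[MODEL_MAX_QUEUES];
	size_t nr_queues;
};

/* Register a queue with the tagset and record whether it opts out. */
void model_tagset_add(struct model_tagset *set, struct model_queue *q,
		      bool self_quiesce)
{
	assert(set->nr_queues < MODEL_MAX_QUEUES);
	q->self_quiesce = self_quiesce;
	q->quiesce_depth = 0;
	set->queues[set->nr_queues++] = q;
}

/* Quiesce every queue that did not opt out; return how many were quiesced. */
size_t model_quiesce_tagset(struct model_tagset *set)
{
	size_t i, quiesced = 0;

	for (i = 0; i < set->nr_queues; i++) {
		struct model_queue *q = set->queues[i];

		if (q->self_quiesce)
			continue;	/* skipped: left to its owner */
		q->quiesce_depth++;
		quiesced++;
	}
	return quiesced;
}

/* Undo model_quiesce_tagset() with the same skip rule. */
size_t model_unquiesce_tagset(struct model_tagset *set)
{
	size_t i, released = 0;

	for (i = 0; i < set->nr_queues; i++) {
		struct model_queue *q = set->queues[i];

		if (q->self_quiesce || q->quiesce_depth == 0)
			continue;
		q->quiesce_depth--;
		released++;
	}
	return released;
}
```

With two namespace queues and one connect_q registered as self-quiescing,
model_quiesce_tagset() touches only the two namespace queues, so the
explicit unquiesce of connect_q from the earlier attempt is not needed
in this model.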


