[PATCH 0/3] improve nvme quiesce time for large amount of namespaces

Sagi Grimberg sagi at grimberg.me
Wed Oct 12 04:13:07 PDT 2022



On 10/12/22 11:43, Chao Leng wrote:
> Add Ming Lei.
> 
> On 2022/10/12 14:37, Sagi Grimberg wrote:
>>
>>>> On Sun, Jul 31, 2022 at 01:23:36PM +0300, Sagi Grimberg wrote:
>>>>> But maybe we can avoid that, and because we allocate
>>>>> the connect_q ourselves, and fully know that it should
>>>>> not be apart of the tagset quiesce, perhaps we can introduce
>>>>> a new interface like:
>>>>> -- 
>>>>> static inline int nvme_ctrl_init_connect_q(struct nvme_ctrl *ctrl)
>>>>> {
>>>>>     ctrl->connect_q = blk_mq_init_queue_self_quiesce(ctrl->tagset);
>>>>>     if (IS_ERR(ctrl->connect_q))
>>>>>         return PTR_ERR(ctrl->connect_q);
>>>>>     return 0;
>>>>> }
>>>>> -- 
>>>>>
>>>>> And then blk_mq_quiesce_tagset can simply look into a per 
>>>>> request-queue
>>>>> self_quiesce flag and skip as needed.
>>>>
>>>> I'd just make that a queue flag set after allocation to keep the
>>>> interface simple, but otherwise this seems like the right thing
>>>> to do.
>>> Now the code used NVME_NS_STOPPED to avoid unpaired stop/start.
>>> If we use blk_mq_quiesce_tagset, It will cause the above mechanism to 
>>> fail.
>>> I review the code, only pci can not ensure secure stop/start pairing.
>>> So there is a choice, We only use blk_mq_quiesce_tagset on fabrics, 
>>> not PCI.
>>> Do you think that's acceptable?
>>> If that's acceptable, I will try to send a patch set.
>>
>> I don't think that this is acceptable. But I don't understand how
>> NVME_NS_STOPPED would change anything in the behavior of tagset-wide
>> quiesce?
> If use blk_mq_quiesce_tagset, it will quiesce all queues of all ns,
> but can not set NVME_NS_STOPPED of all ns. The mechanism of NVME_NS_STOPPED
> will be invalidated.
> NVMe-pci has very complicated quiesce/unquiesce use pattern, 
> quiesce/unquiesce
> may be called unpaired.
> It will cause some backward. There may be some bugs in this scenario:
> A thread: quiesce the queue
> B thread: quiesce the queue
> A thread end, and does not unquiesce the queue.
> B thread: unquiesce the queue, and do something which need the queue 
> must be unquiesed.
> 
> Of course, I don't think it is a good choice to guarantee paired access 
> through NVME_NS_STOPPED,
> there exist unexpected unquiesce and start queue too early.
> But now that the code has done so, the backward should be unacceptable.
> such as this scenario:
> A thread: quiesce the queue
> B thread: want to quiesce the queue but do nothing because 
> NVME_NS_STOPPED is already set.
> A thread: unquiesce the queue
> Now the queue is unquiesced too early for B thread.
> B thread: do something which need the queue must be quiesced.
> 
> Introduce NVME_NS_STOPPED link:
> https://lore.kernel.org/all/20211014081710.1871747-5-ming.lei@redhat.com/

I think we can just change a ns flag to a controller flag ala:
NVME_CTRL_STOPPED, and then do:

void nvme_stop_queues(struct nvme_ctrl *ctrl)
{
	if (!test_and_set_bit(NVME_CTRL_STOPPED, &ns->flags))
		blk_mq_quiesce_tagset(ctrl->tagset);
}
EXPORT_SYMBOL_GPL(nvme_stop_queues);

void nvme_start_queues(struct nvme_ctrl *ctrl)
{
	if (test_and_clear_bit(NVME_CTRL_STOPPED, &ns->flags))
		blk_mq_unquiesce_tagset(ctrl->tagset);
}
EXPORT_SYMBOL_GPL(nvme_start_queues);

Won't that achieve the same result?



More information about the Linux-nvme mailing list