[PATCH v3 2/2] nvme: use blk_mq_[un]quiesce_tagset

Chao Leng lengchao at huawei.com
Thu Oct 20 02:47:46 PDT 2022



On 2022/10/20 14:11, Sagi Grimberg wrote:
> 
> 
> On 10/20/22 06:53, Chao Leng wrote:
>> All controller namespaces share the same tagset, so we can use this
>> interface which does the optimal operation for parallel quiesce based on
>> the tagset type(e.g. blocking tagsets and non-blocking tagsets).
>>
>> nvme connect_q should not be quiesced when quiesce tagset, so set the
>> QUEUE_FLAG_SKIP_TAGSET_QUIESCE to skip it when init connect_q.
>>
>> Currntely we use NVME_NS_STOPPED to ensure pairing quiescing and
>> unquiescing. If use blk_mq_[un]quiesce_tagset, NVME_NS_STOPPED will be
>> invalided, so introduce NVME_CTRL_STOPPED to replace NVME_NS_STOPPED.
>> In addition, we never really quiesce a single namespace. It is a better
>> choice to move the flag from ns to ctrl.
>>
>> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
>> Signed-off-by: Chao Leng <lengchao at huawei.com>
>> ---
>>   drivers/nvme/host/core.c | 57 +++++++++++++++++++-----------------------------
>>   drivers/nvme/host/nvme.h |  3 ++-
>>   2 files changed, 25 insertions(+), 35 deletions(-)
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index 059737c1a2c1..c7727d1f228e 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -4890,6 +4890,7 @@ int nvme_alloc_io_tag_set(struct nvme_ctrl *ctrl, struct blk_mq_tag_set *set,
>>               ret = PTR_ERR(ctrl->connect_q);
>>               goto out_free_tag_set;
>>           }
>> +        blk_queue_flag_set(QUEUE_FLAG_SKIP_TAGSET_QUIESCE, ctrl->connect_q);
>>       }
>>       ctrl->tagset = set;
>> @@ -5013,6 +5014,7 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
>>       clear_bit(NVME_CTRL_FAILFAST_EXPIRED, &ctrl->flags);
>>       spin_lock_init(&ctrl->lock);
>>       mutex_init(&ctrl->scan_lock);
>> +    mutex_init(&ctrl->queue_state_lock);
> 
> Why is this lock needed?
It is used to secure the process which need that the queue must be quiesced.
The scenario:(without the lock)
Thread A: call nvme_stop_queues and set the NVME_CTRL_STOPPED and then quiesce the tagset
           and wait the grace period.
Thread B: call nvme_stop_queues, because the NVME_CTRL_STOPPED is already setted,
           continue to do something which need that the queue must be quiesced,
           because the grace period of the queue is not ended, may cause abnormal.
Thread A: the grace period end, and continue.
So add a lock to ensure that all queues are quiesced after set the NVME_CTRL_STOPPED.

The old code was implemented by forcing a wait for the grace period. Show the code:
	if (!test_and_set_bit(NVME_NS_STOPPED, &ns->flags))
		blk_mq_quiesce_queue(ns->queue);
	else
		blk_mq_wait_quiesce_done(ns->queue);
The old code was not absolutely safe, such as this scenario:
Thread A: test_and_set_bit, and interrupt by hardware irq, lost chance to run.
Thread B: test_and_set_bit, and wait the grace period, and then continue
           to do something which need that the queue must be quiesced,
           because the queue is not quiesced, may cause abnormal.
Thread A: get the chance to run, blk_mq_quiesce_queue, and then continue.
> 
> .



More information about the Linux-nvme mailing list