[PATCH v2 3/3] nvme-rdma: Handle number of queue changes

Chao Leng lengchao at huawei.com
Mon Aug 29 01:02:03 PDT 2022



On 2022/8/28 20:20, Sagi Grimberg wrote:
> 
>>> On Fri, Aug 26, 2022 at 03:31:15PM +0800, Chao Leng wrote:
>>>>> After seeing both versions, I tend to say the first one keeps the
>>>>> 'weird' stuff closer together and doesn't make the call site of
>>>>> nvme_rdma_start_io_queues() do the counting. So my personal preference
>>>> I don't understand "do the counting".
>>>
>>> Sorry. I meant we first start only the queues that have resources
>>> allocated (nr_queues in my patch). And then we only need to start
>>> potentially added queues.
>>>
>>>> Show the code:
>>>> ---
>>>>   drivers/nvme/host/rdma.c | 9 ++++-----
>>>>   1 file changed, 4 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
>>>> index 7d01fb770284..8dfb79726e13 100644
>>>> --- a/drivers/nvme/host/rdma.c
>>>> +++ b/drivers/nvme/host/rdma.c
>>>> @@ -980,10 +980,6 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
>>>>                          goto out_free_tag_set;
>>>>          }
>>>>
>>>> -       ret = nvme_rdma_start_io_queues(ctrl);
>>>> -       if (ret)
>>>> -               goto out_cleanup_connect_q;
>>>
>>> Again, these need to start so that...
>>>
>>>> -
>>>>          if (!new) {
>>>>                  nvme_start_queues(&ctrl->ctrl);
>>>>                  if (!nvme_wait_freeze_timeout(&ctrl->ctrl,
>>>
>>> ... this here has a chance to work.
>> Some requests will be submitted and will fail, and then be retried
>> or failed over. It is similar to nvme_cancel_tagset in nvme_rdma_teardown_io_queues.
>> I think it is acceptable.
> 
> Not really...
> 
> In order for the queue freeze to complete, all pending IOs need
> to complete or error out, and that cannot be guaranteed without
> restarting the queues as some may be waiting on tags and need to
> be restarted in order to complete.
> 
> See 9f98772ba307 ("nvme-rdma: fix controller reset hang during traffic")
Yes, we need to restart the queues before waiting for the queue freeze.
But it is not necessary to call nvme_rdma_start_io_queues before
nvme_start_queues. Certainly there is a downside: the requests that are
waiting on tags will all fail.
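
To make the ordering argument concrete, here is a toy user-space model.
It is only a sketch and not kernel code: struct toy_queue and
toy_wait_freeze() are made-up names, "started" stands in for the effect
of nvme_start_queues, and "transport_live" stands in for the effect of
nvme_rdma_start_io_queues. It assumes that a freeze completes only once
every pending request has either completed or errored out.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; every name here is hypothetical. */
struct toy_queue {
	int pending;         /* requests that must drain before the freeze completes */
	bool started;        /* block-layer queue is running */
	bool transport_live; /* transport queue is connected and can carry I/O */
	int completed;       /* requests that finished successfully */
	int failed;          /* requests that errored out (to be retried or failed over) */
};

/*
 * Returns true if the freeze can complete, false if it would time out.
 * A stopped queue never dispatches, so its pending requests never drain.
 * A started queue drains them: successfully when the transport is live,
 * with an error otherwise.
 */
static bool toy_wait_freeze(struct toy_queue *q)
{
	assert(q->pending >= 0);

	if (!q->started)
		return false;

	while (q->pending > 0) {
		if (q->transport_live)
			q->completed++;
		else
			q->failed++;
		q->pending--;
	}
	return true;
}
```

In this model a queue with started == false never drains, which is the
hang described above. With started == true and transport_live == false
the freeze still completes, but every pending request is counted in
"failed", which is the downside I mentioned.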