[PATCH] nvme-fabrics: fix crash for no IO queues
James Smart
jsmart2021 at gmail.com
Tue Mar 16 20:57:16 GMT 2021
On 3/15/2021 10:08 PM, Sagi Grimberg wrote:
>
>>>>>>> A crash happens when a set features (NVME_FEAT_NUM_QUEUES) command
>>>>>>> times out during nvme over rdma (RoCE) reconnection; the cause is
>>>>>>> the use of a queue that was never allocated.
>>>>>>>
>>>>>>> If the queue is not live, queue requests should not be allowed.
>>>>>>
>>>>>> Can you describe exactly the scenario here? What is the state
>>>>>> here? LIVE? or DELETING?
>>>>> If setting the feature (NVME_FEAT_NUM_QUEUES) fails due to a timeout,
>>>>> or the target returns 0 I/O queues, nvme_set_queue_count will return 0,
>>>>> and then the reconnection will continue and succeed. The controller
>>>>> state is LIVE. Requests will continue to be delivered via ->queue_rq(),
>>>>> and then the crash happens.
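Roughly, the behavior that makes this possible (a paraphrased sketch of
nvme_set_queue_count, not the exact kernel source): a set-features failure
reported as an NVMe status is treated as non-fatal, so the caller sees
"success" with zero I/O queues and the reconnect carries on to LIVE.

static int set_queue_count_sketch(struct nvme_ctrl *ctrl, int *count)
{
	u32 q_count = (*count - 1) | ((*count - 1) << 16);
	u32 result;
	int status, nr_io_queues;

	status = nvme_set_features(ctrl, NVME_FEAT_NUM_QUEUES, q_count,
				   NULL, 0, &result);
	if (status < 0)
		return status;	/* transport error: reconnect is failed */
	if (status > 0) {
		/* NVMe status (e.g. aborted after timeout): report 0 queues */
		*count = 0;
		return 0;
	}
	nr_io_queues = min(result & 0xffff, result >> 16) + 1;
	*count = min(*count, nr_io_queues);
	return 0;
}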
>>>>
>>>> Thinking about this again, we should absolutely fail the reconnection
>>>> when we are unable to set any I/O queues, it is just wrong to
>>>> keep this controller alive...
>>> Keith thinks keeping the controller alive for diagnosis is better.
>>> This is the patch that fails the connection:
>>> https://lore.kernel.org/linux-nvme/20210223072602.3196-1-lengchao@huawei.com/
>>>
>>>
>>> Now we have 2 choices:
>>> 1. Fail the connection when unable to set any I/O queues.
>>> 2. Do not allow queue requests when the queue is not live (sketched
>>>    below).
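A minimal sketch of choice 2, assuming a per-queue liveness flag along the
lines of the rdma transport's NVME_RDMA_Q_LIVE (hypothetical guard, not the
posted patch): refuse to touch a queue that was never (re)allocated even
though the controller state says LIVE.

static blk_status_t queue_rq_guard_sketch(struct nvme_rdma_queue *queue,
					  struct request *rq)
{
	/* controller state may say LIVE, but this queue was never set up */
	if (!test_bit(NVME_RDMA_Q_LIVE, &queue->flags))
		return nvmf_fail_nonready_command(&queue->ctrl->ctrl, rq);

	return BLK_STS_OK;	/* safe to map and post the command */
}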
>>
>> Okay, so there are different views on how to handle this. I personally
>> find that in-band administration for a misbehaving device is a good
>> thing to have, but I won't 'nak' it if the consensus from the people
>> using this is for the other way.
>
> While I understand that this can be useful, I've seen it do more harm
> than good. It is really puzzling to people when the reflected controller
> state is live (and even optimized) yet no I/O is making progress for an
> unknown reason. And the logs are rarely accessed in these cases.
>
> I am also opting for failing it and rescheduling a reconnect.
Agree with Sagi. We also hit this issue a long time ago and I made the
same change (commit 834d3710a093a) that Sagi is suggesting: if the
prior controller instance had io queues, but the new/reconnected
controller fails to create io queues, then the controller create is
failed and a reconnect is scheduled.
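A rough sketch of that approach (hypothetical helper and error code, not the
literal commit): if the new association ends up with no I/O queues where the
prior one had some, return an error so the teardown path reschedules the
reconnect instead of marking the controller LIVE.

static int configure_io_queues_sketch(struct nvme_ctrl *ctrl,
				      bool had_io_queues)
{
	int nr_io_queues = ctrl->opts->nr_io_queues;
	int ret;

	ret = nvme_set_queue_count(ctrl, &nr_io_queues);
	if (ret)
		return ret;

	if (!nr_io_queues && had_io_queues) {
		dev_info(ctrl->device,
			 "no usable I/O queues, failing reconnect\n");
		return -ENOSPC;	/* caller tears down and schedules a new reconnect */
	}

	/* ... go on to allocate, connect and start nr_io_queues queues ... */
	return 0;
}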
-- james