[PATCH] nvme-rdma: fix crash for no IO queues

Chao Leng lengchao at huawei.com
Sat Feb 27 04:30:46 EST 2021



On 2021/2/27 17:12, Hannes Reinecke wrote:
> On 2/24/21 6:59 AM, Chao Leng wrote:
>>
>>
>> On 2021/2/24 7:21, Keith Busch wrote:
>>> On Tue, Feb 23, 2021 at 03:26:02PM +0800, Chao Leng wrote:
>>>> A crash happens when set feature (NVME_FEAT_NUM_QUEUES) times out during
>>>> nvme over rdma (roce) reconnection, because a queue that was never
>>>> allocated is used.
>>>>
>>>> If the controller is not a discovery controller and there are no I/O
>>>> queues, the connection should fail.
>>>
>>> If you're getting a timeout, we need to quit initialization. Hannes
>>> attempted making that status visible for fabrics here:
>>>
>>> http://lists.infradead.org/pipermail/linux-nvme/2021-January/022353.html
>> I know the patch. It cannot handle the scenario where the target is an
>> attacker or behaves incorrectly.
>> If the target returns 0 I/O queues or some other error code, the crash will
>> still happen. We should not allow this to happen.
> I'm fully with you that we shouldn't crash, but at the same time a value of '0' for the number of I/O queues is considered valid.
> So we should fix the code to handle this scenario rather than disallow zero I/O queues.
'0' I/O queues does not make any sense for nvme over fabrics; this is
different from nvme over pci. If the target has a bug, we can debug it
on the target instead of using the admin queue on the host.
The target may be an attacker, or its behavior may be incorrect, so we
must avoid the crash. Another option: prohibit request delivery if the
I/O queues have not been created.
I think failing the connection with '0' I/O queues is the better choice.
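
To make the two options concrete, here is a minimal sketch in plain C.
It is not the driver code: struct sketch_ctrl, both function names and
the -EIO return value are hypothetical placeholders chosen only for
illustration, and the real driver would have to make the equivalent
decision in its own connect and queueing paths.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for controller state; not the kernel's struct. */
struct sketch_ctrl {
	bool is_discovery;          /* discovery controllers need no I/O queues */
	unsigned int nr_io_queues;  /* I/O queue count granted by the target */
};

/*
 * Option 1: fail the connection when a non-discovery controller ends up
 * with zero I/O queues. Returns 0 to proceed, negative errno to fail.
 */
static int sketch_check_io_queues(const struct sketch_ctrl *ctrl)
{
	if (ctrl->is_discovery)
		return 0;
	if (ctrl->nr_io_queues == 0)
		return -EIO;  /* illustrative error code, an assumption */
	return 0;
}

/*
 * Option 2: keep the connection, but refuse to deliver I/O requests
 * while no I/O queue exists. Admin commands are still allowed.
 */
static int sketch_queue_request(const struct sketch_ctrl *ctrl,
				bool is_admin_cmd)
{
	if (!is_admin_cmd && ctrl->nr_io_queues == 0)
		return -EIO;  /* never touch a queue that was not allocated */
	return 0;
}
```

Option 1 rejects the bad state once, at connect time. Option 2 has to
repeat the check on every request, which is part of why I prefer
failing the connection.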
> 
> Cheers,
> 
> Hannes


