[PATCH v2] nvmet: force reconnect when number of queue changes

Sagi Grimberg sagi at grimberg.me
Thu Oct 6 04:37:32 PDT 2022


>>> As far as I can tell, what is missing from a testing point of view is the
>>> ability to fail requests without the DNR bit set or the ability to tell
>>> the host to reconnect. Obviously, an AEN would be nice for this but I
>>> don't know if this is reason enough to extend the spec.
>>
>> Looking into the code, it's the connect that fails on invalid parameter
>> with a DNR, because the host is attempting to connect to a subsystem
>> that does not exist on the port (because it was taken offline for
>> maintenance reasons).
>>
>> So I guess it is valid to allow queue change without removing it from
>> the port, but that does not change the fundamental question on DNR.
>> If the host sees a DNR error on connect, my interpretation is that the
>> host should not retry the connect command itself, but it shouldn't imply
>> anything on tearing down the controller and giving up on it completely,
>> forever.
> 
> Okay, let me try to avoid the DNR discussion for now and propose
> something else? What about adding an 'enable' attribute to the subsys?
> 
> The snippet below does the trick, though there is no explicit
> synchronization between host and target, so it's possible the host
> doesn't notice that the subsystem toggled enabled and updated the number
> of queues. But not sure if it's worth addressing this, it feels a bit
> over-engineered.

I think that for the matter of this patch, you can keep force reconnect.

But I still think we need to be consistent with the different transports
on how we interpret a controller returning DNR...
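
To make the interpretation I have in mind concrete, here is a rough
userspace sketch, not actual host driver code. All names prefixed
with sketch_ are hypothetical, the DNR bit value is assumed to mirror
the kernel's NVME_SC_DNR, and the reconnect budget is only loosely
modeled on the reconnect limit the fabrics transports already have.
The point is only that DNR stops the retry of the connect command
itself, while a later reconnect attempt remains allowed:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: mirrors the kernel's NVME_SC_DNR bit in the status word. */
#define SKETCH_SC_DNR 0x4000u

enum sketch_action {
	SKETCH_RETRY_CMD,	/* resend the same connect command */
	SKETCH_RECONNECT_LATER,	/* drop this attempt, schedule a new reconnect */
	SKETCH_GIVE_UP,		/* reconnect budget exhausted, remove controller */
};

/*
 * Decide what to do after a *failed* connect command.
 * status:         failing status word of the connect command
 * attempts:       reconnect attempts already made
 * max_reconnects: hypothetical budget of reconnect attempts
 */
static enum sketch_action sketch_connect_policy(uint16_t status,
						unsigned int attempts,
						unsigned int max_reconnects)
{
	/* Without DNR the command itself may simply be retried. */
	if (!(status & SKETCH_SC_DNR))
		return SKETCH_RETRY_CMD;

	/*
	 * With DNR we must not retry this command, but that alone does
	 * not mean the controller is gone forever: try again later.
	 */
	if (attempts < max_reconnects)
		return SKETCH_RECONNECT_LATER;

	return SKETCH_GIVE_UP;
}
```

With this policy a connect failing with DNR set ends in teardown only
once the reconnect budget is used up, and every transport calling the
same helper would behave the same way.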