[PATCH] nvme: do not ignore nvme status in nvme_set_queue_count()
Hannes Reinecke
hare at suse.de
Fri Jan 22 11:35:35 EST 2021
On 1/21/21 9:14 PM, Keith Busch wrote:
> On Thu, Jan 21, 2021 at 10:50:21AM +0100, Hannes Reinecke wrote:
>> If the call to nvme_set_queue_count() fails with a status we should
>> not ignore it but rather pass it on to the caller.
>> It's then up to the transport to decide whether to ignore it
>> (like PCI does) or to reset the connection (as would be appropriate
>> for fabrics).
>
> Instead of checking the error, wouldn't checking the number of created
> queues be sufficient? What handling difference do you expect to occur
> between getting a success with 0 queues, vs getting an error?
>
The difference is that an error will (re-)start recovery, 0 queues won't.
But the problem here is that nvme_set_queue_count() is being called
during reconnection, ie during the recovery process itself.
And this command is returned with a timeout, which in any other case is
being treated as a fatal error. Plus we have been sending this command
on the admin queue, so a timeout on the admin queue pretty much _is_ a
fatal error. So we should be terminating the current recovery and
reconnect. None of that will happen if we return '0' queues.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare at suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
More information about the Linux-nvme
mailing list