Q: nvme_rdma and reconnect

Hannes Reinecke hare at suse.de
Thu Dec 22 04:46:10 PST 2016


On 12/22/2016 01:08 PM, Sagi Grimberg wrote:
>
>>> Sagi, Christoph,
>>>
>>> Can you explain what the difference is between the "reset" path and the
>>> "error/reconnect" path in the rdma driver?  From my point of view, it
>>> would seem both, relative to the fabric side of the transport, are
>>> terminating the controller and reconnecting to a new controller to
>>> recover.
>>> So why wouldn't they be the same (single) reset flow?
>>
>> They should use the same flow.  A couple of months ago I had a prototype
>> for that but never got it to work fully.
>
> One more distinction is that reconnect failures will retry periodically
> while reset failures will remove the device (which aligns with the pci
> driver behavior).
>
> We can go via the same flow and condition on the state for the
> differences, but I'm not sure it's easier to understand than two
> distinct routines (although they share a lot of code).
>
And keep in mind that the reset path will be a killer for any
prospective multipath scenario; if you need to remove the device to
reset, you are guaranteed to _never_ get it back under memory pressure.

So please do not enforce a reset for all cases.
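
To make the distinction concrete, here is a minimal sketch in Python. It
is not the driver code: recover(), Outcome and the policy names are made
up for illustration, and each boolean simply stands for the result of one
connect attempt.

```python
from enum import Enum


class Outcome(Enum):
    LIVE = "live"          # controller reconnected and usable again
    RETRYING = "retrying"  # still down, another attempt would be scheduled
    REMOVED = "removed"    # device torn down, no further attempts


def recover(policy, attempts):
    """Walk a sequence of connect attempts under one recovery policy.

    policy   -- "reconnect" (retry periodically) or "reset" (remove on failure)
    attempts -- iterable of booleans, True meaning the attempt succeeded
    """
    if policy not in ("reconnect", "reset"):
        raise ValueError(f"unknown policy: {policy!r}")
    for ok in attempts:
        if ok:
            return Outcome.LIVE
        if policy == "reset":
            # A failed reset removes the device; nothing is left to retry.
            return Outcome.REMOVED
        # reconnect: keep the device and move on to the next attempt
    return Outcome.RETRYING
```

With a transient failure (say the first two attempts fail because an
allocation fails under memory pressure), recover("reconnect", [False,
False, True]) ends up live again, while recover("reset", [False, False,
True]) has already removed the device after the first failure.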

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare at suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


