[RFC] nvme-mpath: delete disk after last connection

Sagi Grimberg sagi at grimberg.me
Tue Sep 29 04:28:51 EDT 2020


>>> I'm okay with that in general, but then again we might run into situations
>>> where an 'all paths down' scenario is actually expected (think of a
>>> temporary network outage on nvme-tcp).
>>> So I guess we need to introduce an additional setting (queue_if_no_path?)
>>> to be specified during the initial connection.
>>
>> The original design was pretty much intentional, as an all paths down
>> event usually is temporary.  So any change in behavior should be based
>> on an option instead of a change in default.  And I think the right
>> way to implement this would be a timer for when to take the gendisk
>> down, with the default remaining infinity.
> 
> Ok, we can let the timer be an option for fabrics so it can be specific
> to the subsystem you're connecting to. The timeout for pcie targets
> probably has to be a module parameter, though.

Something here is not clear to me: we are not really talking about "all
paths down" but rather "all paths lost", which should take the gendisk
down AFAIK.
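
To make the distinction concrete, here is a minimal userspace sketch. It
is a toy model and not kernel code: every type and function name below
is hypothetical and none of them exist in the nvme driver. It only shows
that "all paths down" (paths still exist, none is live) and "all paths
lost" (no paths remain) are different conditions, and that only the
second one would take the disk down in this model.

```c
/* Userspace toy model, NOT kernel code. All names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_path_state {
    TOY_PATH_LIVE,          /* path can carry I/O */
    TOY_PATH_RECONNECTING   /* path exists but is currently unusable */
};

enum toy_head_status {
    TOY_HEAD_USABLE,          /* at least one live path */
    TOY_HEAD_ALL_PATHS_DOWN,  /* paths exist, none live: may still recover */
    TOY_HEAD_ALL_PATHS_LOST   /* no paths left: nothing can recover */
};

/* Classify the multipath head from the states of its remaining paths. */
enum toy_head_status toy_head_classify(const enum toy_path_state *paths,
                                       size_t npaths)
{
    assert(paths != NULL || npaths == 0);

    if (npaths == 0)
        return TOY_HEAD_ALL_PATHS_LOST;

    for (size_t i = 0; i < npaths; i++) {
        if (paths[i] == TOY_PATH_LIVE)
            return TOY_HEAD_USABLE;
    }
    return TOY_HEAD_ALL_PATHS_DOWN;
}

/* In this model the disk goes away only once every path has been removed. */
bool toy_head_should_remove_disk(const enum toy_path_state *paths,
                                 size_t npaths)
{
    return toy_head_classify(paths, npaths) == TOY_HEAD_ALL_PATHS_LOST;
}
```

With two paths that are both reconnecting, the model reports "all paths
down" and keeps the disk; with zero paths it reports "all paths lost"
and asks for removal.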

Keith, shouldn't we modify the gendisk reference manipulation in
nvme_mpath_remove_disk instead of moving it to the call sites?

I'm still unsure why we need a timer for this. If a path is removed
(e.g. disconnected), we shouldn't keep the gendisk up.
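
For comparison, the timer proposal quoted above could be modelled
roughly as below. Again this is only a userspace sketch under stated
assumptions: the names, the millisecond units and the use of UINT64_MAX
as the "infinite" default are all hypothetical and do not correspond to
an existing option or module parameter.

```c
/* Userspace toy model, NOT kernel code. All names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding of "never take the disk down" (the proposed default). */
#define TOY_NO_PATH_TIMEOUT_INFINITE UINT64_MAX

/*
 * Return true once the head has been without a usable path for at least
 * timeout_ms. With the infinite default the timer never expires, which
 * matches the current behaviour of keeping the disk around.
 */
bool toy_no_path_timer_expired(uint64_t no_path_since_ms, uint64_t now_ms,
                               uint64_t timeout_ms)
{
    assert(now_ms >= no_path_since_ms);

    if (timeout_ms == TOY_NO_PATH_TIMEOUT_INFINITE)
        return false;

    return now_ms - no_path_since_ms >= timeout_ms;
}
```

Note that this policy only covers the case where paths still exist but
none is usable; it says nothing about what happens when the last path
is removed outright.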


