nvme reservation commands during controller reset

Sagi Grimberg sagi at grimberg.me
Mon Aug 17 15:29:49 EDT 2020


>>>>> Amit,
>>>>>
>>>>> can you try the branch below?  Pretty much hot off the press, but I
>>>>> think this should address your problem:
>>>>>
>>>>> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/nvme-pr-fix
>>>>
>>>> I like the approach, but I think we'll need a bit more path awareness,
>>>> e.g. checking for available paths to decide whether we requeue or fail.
>>>
>>> If there is no available path, nvme_find_path will fail and thus we'll
>>> error it.  This is the same mechanism as used by
>>> nvme_ns_head_submit_bio.
>>
>> nvme_find_path will return ns=NULL if none of the paths is LIVE,
>> but we want to requeue if we have an available path (e.g. ANA state is
>> inaccessible temporarily or ctrl state is RESETTING/CONNECTING).
> 
> nvme_find_path will return a namespace if there is a namespace with
> an optimized or non-optimized state, and which does not have the
> NVME_NS_ANA_PENDING or NVME_NS_REMOVING flags set on a controller that is
> in the live or deleting states.
> 
> And that is exactly what nvme_ns_head_submit_bio relies on.

Agree.
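
To make sure we are talking about the same rule, here is a rough
user-space sketch of the selection criteria described above. All type
and function names here (struct path, path_is_usable, find_path) are
hypothetical stand-ins, not the kernel definitions, and preferring an
optimized path over a non-optimized one is my assumption rather than
something stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's controller and ANA states. */
enum ctrl_state { CTRL_LIVE, CTRL_RESETTING, CTRL_CONNECTING, CTRL_DELETING };
enum ana_state { ANA_OPTIMIZED, ANA_NONOPTIMIZED, ANA_INACCESSIBLE };

/* Hypothetical flag bits modeling NVME_NS_ANA_PENDING / NVME_NS_REMOVING. */
#define NS_ANA_PENDING (1u << 0)
#define NS_REMOVING    (1u << 1)

struct path {
    enum ctrl_state ctrl_state;
    enum ana_state ana_state;
    unsigned int flags;
};

/* A path is usable if its controller is live or deleting, neither flag
 * is set, and its ANA state is optimized or non-optimized. */
bool path_is_usable(const struct path *p)
{
    if (p->ctrl_state != CTRL_LIVE && p->ctrl_state != CTRL_DELETING)
        return false;
    if (p->flags & (NS_ANA_PENDING | NS_REMOVING))
        return false;
    return p->ana_state == ANA_OPTIMIZED ||
           p->ana_state == ANA_NONOPTIMIZED;
}

/* Return the first usable optimized path, else the first usable
 * non-optimized path, else NULL (assumed preference order). */
const struct path *find_path(const struct path *paths, size_t n)
{
    const struct path *fallback = NULL;

    assert(paths != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!path_is_usable(&paths[i]))
            continue;
        if (paths[i].ana_state == ANA_OPTIMIZED)
            return &paths[i];
        if (!fallback)
            fallback = &paths[i];
    }
    return fallback;
}
```

Note that a path whose controller is RESETTING or CONNECTING is simply
skipped here, so a NULL result does not distinguish "nothing will ever
come back" from "a path is temporarily unavailable".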

>> Only if no path is available for request execution do we fail the
>> request.
>>
>> If we are providing multipathing for reservations, we should make
>> the same effort as we do for normal I/O. This could mean waiting
>> for some indication on the ns path states (wait for a completion in
>> nvme_submit_sync_cmd_disk and wake it up in nvme_mpath_set_live).
> 
> The only difference in nvme_ns_head_submit_bio is that the bio is
> queued up if the controller is in a suitable state.

These are details, let's first discuss the semantics we should be
providing.

> But that isn't
> something we can implement for the passthrough path as we don't have
> an individual bio that we can queue up.  Note that no other retries
> are handled by the pr path either, so there is a need for some
> amount of retrying in the caller anyway.

So what are the semantics we want to provide? Best-effort? That is not
very different from what we have today. We could, if we choose to, have
the pr context wait for a path_map_changed completion and have
nvme_mpath_set_live and nvme_mpath_check_last_path do complete_all on
this completion.
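
To illustrate the idea, here is a user-space model of such a completion
built on pthreads. It is only a sketch: the struct and helpers below
imitate the kernel completion API by name but are my own stand-ins, the
timeout variant is an assumption (nothing above says whether the wait
should be bounded), and path_map_changed does not exist in the driver
today.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* User-space model of a completion; not the kernel's struct completion. */
struct completion {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool done;
};

void init_completion(struct completion *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->done = false;
}

/* Wake every waiter; this is what the path-state update side would call. */
void complete_all(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    c->done = true;
    pthread_cond_broadcast(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

/* Non-blocking check of the completion state. */
bool completion_done(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    bool done = c->done;
    pthread_mutex_unlock(&c->lock);
    return done;
}

/* Wait until completed or until timeout_ms elapses.
 * Returns true if the completion fired, false on timeout. */
bool wait_for_completion_timeout_ms(struct completion *c, long timeout_ms)
{
    struct timespec deadline;
    int err = 0;

    assert(timeout_ms >= 0);
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&c->lock);
    while (!c->done && err != ETIMEDOUT)
        err = pthread_cond_timedwait(&c->cond, &c->lock, &deadline);
    bool done = c->done;
    pthread_mutex_unlock(&c->lock);
    return done;
}
```

In this model the pr context would call the wait helper when no usable
path is found and then retry the path lookup once woken, failing the
command only if the wait times out or the lookup still finds nothing.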

But the question is what semantics we want to provide...


