[PATCH 0/3 rfc] Fix nvme-tcp and nvme-rdma controller reset hangs
Sagi Grimberg
sagi at grimberg.me
Thu Mar 18 19:31:35 GMT 2021
>> Placing the request on the requeue_list is fine, but the question is
>> when to kick the requeue_work; nothing guarantees that an alternate path
>> exists or will appear within a sane period. So constantly requeue+kick sounds like
>> a really bad practice to me.
>
> nvme_mpath_set_live(), where you reported the deadlock, kicks the
> requeue_list. The difference that NOWAIT provides is that
> nvme_mpath_set_live's schronize_srcu() is no longer blocked forever
> because the .submit_bio() isn't waiting for entery on a frozen queue, so
> now it's free to schedule the dispatch.
>
> There's probably an optimization to kick it sooner if there's a viable
> alternate path, but that could be a follow on.
That would be mandatory I think, otherwise this would introduce
a regression...
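To make that concrete, a minimal sketch of what the follow-on could look
like (hypothetical helper; nvme_find_path(), head->srcu and
head->requeue_work are the existing nvme multipath members, but the
function name and placement here are illustrative only):

```c
/*
 * Hypothetical sketch: kick the requeue work only when a viable
 * alternate path currently exists, instead of kicking unconditionally.
 */
static void nvme_mpath_kick_if_path(struct nvme_ns_head *head)
{
	bool has_path;
	int srcu_idx;

	srcu_idx = srcu_read_lock(&head->srcu);
	has_path = nvme_find_path(head) != NULL;
	srcu_read_unlock(&head->srcu, srcu_idx);

	/* No viable path: leave bios parked on head->requeue_list. */
	if (has_path)
		kblockd_schedule_work(&head->requeue_work);
}
```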
> If there's no immediate viable path, then the requests would remain on
> the requeue list. That currently happens as long as there's a potential
> controller in a reset or connecting state.
Well, it's also worth keeping in mind that we'd now need to clone the bio
because we need to override bi_end_io, which adds some overhead
in the data path. Unless we make submit_bio return a status, which
is a much larger change in scope, I would expect...
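For reference, the clone-and-override pattern being discussed would look
roughly like this (a sketch only, using the 2021-era bio_clone_fast()
interface; nvme_mpath_clone_end_io and the bio_set are hypothetical names,
not existing nvme code):

```c
/* Completion handler for the clone: propagate status to the parent bio. */
static void nvme_mpath_clone_end_io(struct bio *clone)
{
	struct bio *parent = clone->bi_private;

	parent->bi_status = clone->bi_status;
	bio_put(clone);
	bio_endio(parent);
}

	/* In the submission path: clone so bi_end_io can be overridden. */
	clone = bio_clone_fast(bio, GFP_NOIO, &head->bio_set);
	clone->bi_private = bio;
	clone->bi_end_io = nvme_mpath_clone_end_io;
	submit_bio_noacct(clone);
```

This is the same shape device-mapper uses for stacked completion, and it
is exactly the per-bio allocation overhead the paragraph above objects to.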