[PATCH 0/3 rfc] Fix nvme-tcp and nvme-rdma controller reset hangs

Christoph Hellwig hch at lst.de
Fri Mar 19 14:05:32 GMT 2021


On Fri, Mar 19, 2021 at 06:52:56AM +0900, Keith Busch wrote:
> Having submit_bio() return the enter status was where I was going with
> this, but the recursive handling makes this more complicated than I
> initially thought.

Note that the recursion handling is not really required for
nvme-multipath.  I have some plans to actually kill it off entirely
for blk-mq submissions, which needs work on the bounce buffering
and bio splitting code, but should not be too hard.

> If you use the NOWAIT flag today with a freezing queue, the IO will end
> with BLK_STS_AGAIN and punt retry handling to the application. I'm
> guessing you don't want that to happen, so a little more is required for
> this idea.

We really should not use NOWAIT but a flag that only escapes the
freeze protection.  I think REQ_FAILFAST_DRIVER should probably be changed
to trigger that, but even if not we could add a new flag.
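
To make the distinction concrete, here is a minimal user-space sketch of
one possible reading of the above.  It is not kernel code: every sketch_*
and SKETCH_* name is hypothetical, and it assumes that NOWAIT opts out of
every blocking point on the submission path, while the proposed flag opts
out of the wait on a frozen queue only and hands the bio straight back to
the submitter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-bio flags; illustrative names, not the kernel's. */
enum sketch_flags {
	SKETCH_NOWAIT		= 1u << 0, /* models NOWAIT: never block anywhere */
	SKETCH_NOFREEZE_WAIT	= 1u << 1, /* assumed new flag: skip only the freeze wait */
};

enum sketch_status {
	SKETCH_OK,		/* bio was accepted */
	SKETCH_AGAIN,		/* models BLK_STS_AGAIN: handed back to the submitter */
	SKETCH_WOULD_SLEEP,	/* a real submitter would sleep at this point */
};

/* Toy queue with two places where a submitter can block. */
struct sketch_queue {
	bool frozen;		/* queue is frozen, e.g. during a controller reset */
	unsigned int free_tags;	/* 0 means tag allocation would have to wait */
};

static enum sketch_status sketch_submit(const struct sketch_queue *q,
					unsigned int flags)
{
	assert(q != NULL);

	/* Gate 1: freeze protection.  Both flags refuse to wait here. */
	if (q->frozen) {
		if (flags & (SKETCH_NOWAIT | SKETCH_NOFREEZE_WAIT))
			return SKETCH_AGAIN;
		return SKETCH_WOULD_SLEEP;
	}

	/* Gate 2: tag allocation.  Only NOWAIT refuses to wait here. */
	if (q->free_tags == 0) {
		if (flags & SKETCH_NOWAIT)
			return SKETCH_AGAIN;
		return SKETCH_WOULD_SLEEP;
	}

	return SKETCH_OK;
}
```

In this model a bio carrying only SKETCH_NOFREEZE_WAIT comes back to the
submitter immediately when the queue is frozen, but still waits for a tag
on a busy, unfrozen queue, which is where it differs from SKETCH_NOWAIT.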

> Since it's an error path, perhaps a block operations callback is okay?
> Something like this compile tested patch?

We really should not need an indirection.  And more importantly, I don't
think the consuming driver cares; it is really the submitting one that does.


