[PATCH RFC 5/5] block, nvme: add failed_bio callback for multipath bio failover
Christoph Hellwig
hch at lst.de
Wed May 20 00:27:46 PDT 2026
On Tue, May 19, 2026 at 10:23:26AM -0700, Keith Busch wrote:
> From: Keith Busch <kbusch at kernel.org>
>
> The nvme driver has long utilized a zero capacity to indicate the path
> isn't reachable, which creates a race condition with IO dispatch when
> paths are being detached on a live system: when the block layer rejects
> a bio early due to a capacity check failure, drivers with multipath
> support using the original bio have no interception point to redirect
> the bio to another path.

Trying to reverse-engineer this: the problem is that the block layer
code sees the bio as being beyond the capacity and completes it directly,
right?

IMHO the right fix is to get rid of the capacity hacks, and have a flag
we can catch in the nvme driver and complete through the existing
mechanisms.

Having a callback into the driver for a specific error condition seems
odd.  And I think there are a bunch of cases where we could call this on
a bio the driver hasn't even seen yet, including from file system code.