[PATCH RFC 3/3] nvme: delay failover by command quiesce timeout

Wed Apr 16 01:30:11 PDT 2025

On Tue, Apr 15, 2025 at 05:40:16PM -0700, Mohamed Khalfella wrote:
> On 2025-04-15 14:17:48 +0200, Daniel Wagner wrote:
> > Pasthrough commands should fail immediately. Userland is in charge here,
> > not the kernel. At least this what should happen here.
> 
> I see your point. Unless I am missing something these requests should be
> held equally to bio requests from multipath layer. Let us say app
> submitted write a request that got canceled immediately, how does the app
> know when it is safe to retry the write request?

Good question, but nothing new as far I can tell. If the kernel doesn't
start to retry passthru IO commands, we have to figure out how to pass
additional information to the userland.

> Holding requests like write until it is safe to be retried is the whole
> point of this work, right?

My first goal was to address the IO commands submitted via the block
layer. I didn't had the IO passthru interface on my radar. I agree,
getting the IO passthru path correct is also good idea.

Anyway, I don't have a lot of experience with the IO passthru interface.
A quick test to fitgure out what the status quo is: 'nvme read'
results in an EINTR or 'Resource temporarily unavailable' (EAGAIN or
EWOULDBLOCK) when a tcp connection between host and target is blocked
(*), though it comes from an admin command. It looks like I have to dive
into this a bit more.

(*) I think this might be the same problem as the systemd folks have
reported.