[PATCH] nvme-mpath: fix I/O failure with EAGAIN when failing over I/O

Martin Wilck mwilck at suse.com
Tue Jun 20 02:15:40 PDT 2023


Hello Sagi,

On Mon, 2023-06-19 at 17:10 +0300, Sagi Grimberg wrote:
> It is possible that the next available path we fail over to happens to
> be frozen (for example, during connection establishment). If the
> original I/O was set with NOWAIT, this causes the I/O to fail
> unnecessarily with EAGAIN because the request queue cannot be entered.
> 
> The NOWAIT restriction that was originally set for the I/O is no longer
> relevant or needed because this is the nvme requeue context. Hence we
> clear the REQ_NOWAIT flag when failing over I/O.

Could you please explain this in more detail? We are on the bio level,
thus IIUC a new request will need to be allocated when the bio is
requeued. This means that if the fail-over queue is frozen, e.g. during
an NVMe controller reset, I/O may be blocked for a possibly very long
time, which is what the NOWAIT flag was initially supposed to avoid.
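
To make the trade-off concrete, here is a minimal user-space sketch of
the decision taken when a bio tries to enter a frozen queue. It is only a
model under stated assumptions, not kernel code: model_queue,
model_queue_enter() and MODEL_REQ_NOWAIT are hypothetical names invented
for illustration, and the flag value does not match the kernel's
REQ_NOWAIT.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag bit for this model; not the kernel's REQ_NOWAIT value. */
#define MODEL_REQ_NOWAIT (1u << 0)

/* Hypothetical stand-in for a request queue that may be frozen. */
struct model_queue {
	bool frozen;
};

/*
 * Model of entering a queue on behalf of a bio.
 *
 * Returns 0 when the queue can be entered (possibly after waiting) and
 * -EAGAIN when the queue is frozen and the caller asked not to wait.
 * *would_block is set to true when the caller would have to sleep until
 * the queue is unfrozen.
 */
static int model_queue_enter(const struct model_queue *q, unsigned int opf,
			     bool *would_block)
{
	*would_block = false;

	if (!q->frozen)
		return 0;

	if (opf & MODEL_REQ_NOWAIT)
		return -EAGAIN;	/* fail fast: the behaviour the patch removes */

	*would_block = true;	/* flag cleared: wait for the unfreeze */
	return 0;
}
```

With the flag set, a frozen queue yields -EAGAIN right away; with the
flag cleared, the same call succeeds only after a wait of unbounded
length, which is the case the question above is about.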

I am asking because we've seen a similar phenomenon with a third-party
multipath implementation recently.


Regards
Martin

> This fixes a simple test case: an nvme controller reset during I/O on a
> multipath device that has only a single path, where the I/O fails with
> the "Resource temporarily unavailable" errno. Note that this reproduces
> with io_uring, which sets IOCB_NOWAIT by default.
> 
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
> ---
>  drivers/nvme/host/multipath.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
> index 2bc159a318ff..6425e6ec3932 100644
> --- a/drivers/nvme/host/multipath.c
> +++ b/drivers/nvme/host/multipath.c
> @@ -102,6 +102,7 @@ void nvme_failover_req(struct request *req)
>         spin_lock_irqsave(&ns->head->requeue_lock, flags);
>         for (bio = req->bio; bio; bio = bio->bi_next) {
>                 bio_set_dev(bio, ns->head->disk->part0);
> +               bio->bi_opf &= ~REQ_NOWAIT;
>                 if (bio->bi_opf & REQ_POLLED) {
>                         bio->bi_opf &= ~REQ_POLLED;
>                         bio->bi_cookie = BLK_QC_T_NONE;
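
For readers following the hunk above, the effect of the added line on a
chain of bios can be sketched in user space as follows. This is an
illustrative model only: model_bio, model_failover_bios() and the
MODEL_REQ_* bits are hypothetical names and values, and the model leaves
out the locking, the bio_set_dev() call and the cookie reset that the
real loop performs.

```c
#include <stddef.h>

/* Hypothetical flag bits for this model; not the kernel's values. */
#define MODEL_REQ_NOWAIT (1u << 0)
#define MODEL_REQ_POLLED (1u << 1)
#define MODEL_REQ_SYNC   (1u << 2)

/* Hypothetical stand-in for struct bio: only the fields the loop touches. */
struct model_bio {
	unsigned int bi_opf;
	struct model_bio *bi_next;
};

/*
 * Walk the bio chain of a failed request and strip the flags that no
 * longer apply once the bios are handed to the requeue path. All other
 * flags are left untouched.
 */
static void model_failover_bios(struct model_bio *head)
{
	for (struct model_bio *bio = head; bio; bio = bio->bi_next) {
		bio->bi_opf &= ~MODEL_REQ_NOWAIT;	/* the line the patch adds */
		if (bio->bi_opf & MODEL_REQ_POLLED)
			bio->bi_opf &= ~MODEL_REQ_POLLED;
	}
}
```

In this model, every bio in the chain loses NOWAIT unconditionally, so
the submitter's original intent is no longer visible once the bio is
resubmitted.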



