[PATCH 1/1] nvme-multipath: Skip nr_active increments in RETRY disposition

Mohamed Khalfella mkhalfella at purestorage.com
Thu Sep 25 08:59:50 PDT 2025


On 2025-09-25 08:43:44 -0600, Keith Busch wrote:
> On Wed, Sep 24, 2025 at 06:14:27PM -0700, Mohamed Khalfella wrote:
> > On 2025-09-24 17:02:51 -0600, Keith Busch wrote:
> > > On Wed, Sep 24, 2025 at 03:43:18PM -0700, Amit Chaudhary wrote:
> > > >  static inline void nvme_start_request(struct request *rq)
> > > >  {
> > > > -	if (rq->cmd_flags & REQ_NVME_MPATH)
> > > > +	if ((rq->cmd_flags & REQ_NVME_MPATH) && (!nvme_req(rq)->retries))
> > > >  		nvme_mpath_start_request(rq);
> > > >  	blk_mq_start_request(rq);
> > > >  }
> > > 
> > > > Using "retries" is a bit indirect as a proxy for multipath active counts.
> > > Could this be moved to the mpath start instead, directly using the flag
> > > that accounts for the path? This also helps to keep track if the command
> > > gets retried across a user toggling the policy to "qd".
> > > 
> > > ---
> > > diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
> > > index 3da980dc60d91..1c630967ddd40 100644
> > > --- a/drivers/nvme/host/multipath.c
> > > +++ b/drivers/nvme/host/multipath.c
> > > @@ -182,7 +182,8 @@ void nvme_mpath_start_request(struct request *rq)
> > >         struct nvme_ns *ns = rq->q->queuedata;
> > >         struct gendisk *disk = ns->head->disk;
> > > 
> > > -       if (READ_ONCE(ns->head->subsys->iopolicy) == NVME_IOPOLICY_QD) {
> > > +       if (READ_ONCE(ns->head->subsys->iopolicy) == NVME_IOPOLICY_QD &&
> > > +           !(nvme_req(rq)->flags & NVME_MPATH_CNT_ACTIVE)) {
> > >                 atomic_inc(&ns->ctrl->nr_active);
> > >                 nvme_req(rq)->flags |= NVME_MPATH_CNT_ACTIVE;
> > >         }
> > > --
> > 
> >         nvme_req(rq)->flags |= NVME_MPATH_IO_STATS;
> >         nvme_req(rq)->start_time = bdev_start_io_acct(disk->part0, req_op(rq),
> >                                                       jiffies);
> > 
> > Doing it this way might mess up stats accounting, because the two
> > lines above will be executed again on request retry. I do not think we
> > need that, right?
> 
> Yeah, but we can use the other flag to know if it's already been
> accounted:
> 
> --- a/drivers/nvme/host/multipath.c
> +++ b/drivers/nvme/host/multipath.c
> @@ -182,12 +182,14 @@ void nvme_mpath_start_request(struct request *rq)
>         struct nvme_ns *ns = rq->q->queuedata;
>         struct gendisk *disk = ns->head->disk;
> 
> -       if (READ_ONCE(ns->head->subsys->iopolicy) == NVME_IOPOLICY_QD) {
> +       if (READ_ONCE(ns->head->subsys->iopolicy) == NVME_IOPOLICY_QD &&
> +           !(nvme_req(rq)->flags & NVME_MPATH_CNT_ACTIVE)) {
>                 atomic_inc(&ns->ctrl->nr_active);
>                 nvme_req(rq)->flags |= NVME_MPATH_CNT_ACTIVE;
>         }
> 
> -       if (!blk_queue_io_stat(disk->queue) || blk_rq_is_passthrough(rq))
> +       if (!blk_queue_io_stat(disk->queue) || blk_rq_is_passthrough(rq) ||
> +           nvme_req(rq)->flags & NVME_MPATH_IO_STATS)
>                 return;
> 
>         nvme_req(rq)->flags |= NVME_MPATH_IO_STATS;

This works. However, I find Amit's change more straightforward to
understand. nvme_mpath_start_request()/nvme_mpath_end_request() are
called when a request is started/ended, respectively. For a request that
is retried on the same path, nvme_mpath_start_request() need not be
called again. Such a retry should be transparent to the multipath layer.
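
To make the comparison concrete, here is a rough userspace model of the
nr_active accounting only. It is a sketch, not kernel code: every model_*
name is hypothetical, a plain int stands in for the atomic counter, and it
assumes the end hook runs once at final completion while each retry goes
through the start path again. It does not model I/O stats or an iopolicy
change between retries.

```c
#include <assert.h>

/* Hypothetical stand-in for NVME_MPATH_CNT_ACTIVE. */
#define MODEL_CNT_ACTIVE 0x1u

enum model_variant {
	MODEL_UNGUARDED,	/* behaviour before either change */
	MODEL_CALLER_CHECK,	/* caller skips the mpath hook when retries != 0 */
	MODEL_FLAG_CHECK,	/* mpath hook skips when the flag is already set */
};

struct model_req {
	unsigned int flags;
	unsigned int retries;
};

/* Plain int standing in for ctrl->nr_active (atomic in the kernel). */
static int model_nr_active;

static void model_mpath_start(struct model_req *rq, enum model_variant v)
{
	/* Flag-based guard: account for the path at most once. */
	if (v == MODEL_FLAG_CHECK && (rq->flags & MODEL_CNT_ACTIVE))
		return;
	model_nr_active++;
	rq->flags |= MODEL_CNT_ACTIVE;
}

static void model_start_request(struct model_req *rq, enum model_variant v)
{
	/* Caller-side guard: a retry never reaches the mpath hook. */
	if (v == MODEL_CALLER_CHECK && rq->retries)
		return;
	model_mpath_start(rq, v);
}

static void model_end_request(struct model_req *rq)
{
	/* Assumed to run once, at final completion. */
	if (rq->flags & MODEL_CNT_ACTIVE)
		model_nr_active--;
}

/* Returns the nr_active value left behind after the request completes. */
static int model_leaked_after(enum model_variant v, unsigned int nr_retries)
{
	struct model_req rq = { 0, 0 };
	unsigned int i;

	model_nr_active = 0;
	model_start_request(&rq, v);
	for (i = 0; i < nr_retries; i++) {
		rq.retries++;
		model_start_request(&rq, v);
	}
	model_end_request(&rq);
	return model_nr_active;
}
```

Under these assumptions, the unguarded variant leaves one stale count per
retry, while both guarded variants return the counter to zero. The
difference between them is only where the check lives: in the caller, or
inside the multipath start hook.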


