[PATCH blktests v3] nvme/046: test queue count changes on reconnect

Hannes Reinecke hare at suse.de
Thu Sep 15 08:41:46 PDT 2022


On 9/15/22 14:50, Daniel Wagner wrote:
>> So do we agree the fc host should not stop reconnecting? James?
> 
> With the change below the test succeeds once:
> 
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index 42767fb75455..b167cf9fee6d 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -3325,8 +3325,6 @@ nvme_fc_reconnect_or_delete(struct nvme_fc_ctrl *ctrl, int status)
>                  dev_info(ctrl->ctrl.device,
>                          "NVME-FC{%d}: reset: Reconnect attempt failed (%d)\n",
>                          ctrl->cnum, status);
> -               if (status > 0 && (status & NVME_SC_DNR))
> -                       recon = false;
>          } else if (time_after_eq(jiffies, rport->dev_loss_end))
>                  recon = false;
> 
> This part was introduced with f25f8ef70ce2 ("nvme-fc: short-circuit
> reconnect retries"):
> 
>      nvme-fc: short-circuit reconnect retries
> 
>      Returning an nvme status from nvme_fc_create_association() indicates
>      that the association is established, and we should honour the DNR bit.
>      If it's set a reconnect attempt will just return the same error, so
>      we can short-circuit the reconnect attempts and fail the connection
>      directly.
> 
> It looks like the reasoning here didn't take into consideration the
> scenario we have here. I'd say we should not do it and handle it similar
> as with have with tcp/rdma.
> 
But that is wrong.
When we are evaluating the NVMe status we _need_ to check the DNR bit; 
that's precisely what it's for.

The real reason here is a queue inversion with fcloop; we've had them 
for ages, and I'm not surprised that it pops off now.
In fact, I was quite surprised that I didn't hit it when updating 
blktests; in previous versions I had to insert an explicit 'sleep 2' 
before disconnect to make it work.

I'd rather fix that than reverting FC to the (wrong) behaviour.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare at suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman



More information about the Linux-nvme mailing list