[PATCH v2] nvmet: force reconnect when number of queue changes
Knight, Frederick
Frederick.Knight at netapp.com
Wed Sep 28 09:01:57 PDT 2022
> So I guess it is valid to allow queue change without removing it from the
> port, but that does not change the fundamental question on DNR.
> If the host sees a DNR error on connect, my interpretation is that the host
> should not retry the connect command itself, but it shouldn't imply
> anything about tearing down the controller and giving up on it
> completely, forever.
I agree that DNR is command specific - the problem is when the host sees DNR on a Connect command. If DNR means "Don't bother to retry the exact same command" - and the command is a Connect command - what does that mean - never try to connect to that NVM subsystem ever again? That doesn't seem right all the time; there might be cases where it IS the right thing to do.
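
For context, on the Linux host side DNR today only stops retries of the
one failed command. A simplified sketch, loosely following
nvme_decide_disposition() in drivers/nvme/host/core.c (not the literal
upstream code):

	/*
	 * Simplified sketch of the host's per-command retry decision.
	 * DNR only stops retries of this one command; nothing here
	 * tears down the controller. (The fabrics reconnect paths make
	 * their own, separate decisions.)
	 */
	enum nvme_disposition { COMPLETE, RETRY };

	static enum nvme_disposition nvme_decide_disposition(struct request *req)
	{
		if (likely(nvme_req(req)->status == 0))
			return COMPLETE;

		if (blk_noretry_request(req) ||
		    (nvme_req(req)->status & NVME_SC_DNR) ||
		    nvme_req(req)->retries >= nvme_max_retries)
			return COMPLETE;	/* give up on this command only */

		return RETRY;
	}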
DNR makes sense for non-Connect commands - doing the same thing again has a very high likelihood of failing in exactly the same way it did the first time. Things like an invalid LBA value in a Read command - no matter how many times you retry, that LBA is still going to be invalid. But Connect has the possibility that the outcome could be different - things can change. Especially if the device knows it is doing an update, and that it will return - WHY did the device set DNR? Maybe the device should not be setting DNR for this case. Maybe the device could just hold the Connect command for a short time (say 5-10 seconds) and then FAIL the command (with DNR=0); then the host tries again. The device just inserts a delay based on how long it might take to finish restarting (and the number of retries the host will do).
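
A device-side sketch of that idea (hypothetical behavior, not what
nvmet does today; the function name is made up):

	/*
	 * Hypothetical target-side handling while the subsystem is
	 * mid-update: stall the Connect briefly, then fail it WITHOUT
	 * DNR so the host simply retries. A real implementation would
	 * use delayed completion rather than sleeping in the handler.
	 */
	static void nvmet_fail_connect_retryable(struct nvmet_req *req)
	{
		msleep(5000);	/* absorb part of the restart window */

		/* today nvmet returns NVME_SC_CONNECT_INVALID_PARAM |
		 * NVME_SC_DNR here; leaving DNR clear keeps the
		 * Connect retryable */
		nvmet_req_complete(req, NVME_SC_CONNECT_INVALID_PARAM);
	}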
Should we talk about the CRD field? That only works 1) if you have an Identify Controller structure to tell you the delay times (CRDT1-3), and 2) if the host enables it (ACRE in the Host Behavior Support feature). We never really talk (in the spec) about doing that for a Connect kind of function - we assume there is already a valid connection to use that feature. Using that (and I do mean writing spec text) would allow a Connect command to suggest a time limit for a retry.
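
For commands on existing queues, the Linux host already has that
plumbing: with ACRE enabled, a non-zero CRD in the status selects one
of the CRDT1-3 values (100 ms units) from Identify Controller as the
retry delay. A simplified sketch of nvme_retry_req() from
drivers/nvme/host/core.c:

	/*
	 * Simplified from nvme_retry_req(). CRD is bits 12:11 of the
	 * 15-bit status (mask NVME_SC_CRD); ctrl->crdt[] caches
	 * CRDT1..3 from Identify Controller.
	 */
	static void nvme_retry_req(struct request *req)
	{
		unsigned long delay = 0;
		u16 crd = (nvme_req(req)->status & NVME_SC_CRD) >> 11;

		if (crd)
			delay = nvme_req(req)->ctrl->crdt[crd - 1] * 100; /* ms */

		nvme_req(req)->retries++;
		blk_mq_requeue_request(req, false);
		blk_mq_delay_kick_requeue_list(req->q, delay);
	}

But as noted, none of that is defined to apply to a Connect that is
still creating the queue - which is exactly the gap the spec text would
have to fill.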
Fred
> -----Original Message-----
> From: Sagi Grimberg <sagi at grimberg.me>
> Sent: Wednesday, September 28, 2022 10:23 AM
> To: Daniel Wagner <dwagner at suse.de>; Knight, Frederick
> <Frederick.Knight at netapp.com>
> Cc: linux-nvme at lists.infradead.org; Shinichiro Kawasaki
> <shinichiro.kawasaki at wdc.com>; hare at suse.de
> Subject: Re: [PATCH v2] nvmet: force reconnect when number of queue
> changes
>
> >> Would this be a case for a new AEN - controller configuration changed?
> >> I'm also wondering exactly what changed in the controller? It can't
> >> be the "Number of Queues" feature (that can't change - the spec says
> >> the controller shall not change the value allocated between resets).
> >> Is it the MQES field in the CAP property that changes (queue size)?
> >>
> >> We already have change reporting for: Namespace attribute,
> >> Predictable Latency, LBA status, EG aggregate, Zone descriptor,
> >> Discovery Log, Reservations. We've questioned whether we need a
> >> Controller Attribute Changed.
> >>
> >> Would this be a case for an exception? Does the DNR bit apply only
> >> to commands sent on queues that already exist (i.e., NOT the connect
> >> command since that command is actually creating the queue)? FWIW - I
> >> don't like exceptions.
> >>
> >> Can you elaborate on exactly what is changing?
> >
> > The background story is that we have a setup with two targets (*) and
> > the host is connected to both of them (HA setup). Both servers run the
> > same software version. The host is told via Number of Queues (Feature
> > Identifier 07h) how many queues are supported (N queues).
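
For reference, that negotiation is Set Features (FID 07h), and the
controller returns the allocated counts in CQE Dword 0 - the host takes
the minimum. A simplified sketch of nvme_set_queue_count() from
drivers/nvme/host/core.c:

	/*
	 * Simplified from nvme_set_queue_count(). NSQR/NCQR in CDW11
	 * are 0's based; the controller answers with the allocated
	 * counts (NSQA/NCQA) in CQE Dword 0.
	 */
	int nvme_set_queue_count(struct nvme_ctrl *ctrl, int *count)
	{
		u32 q_count = (*count - 1) | ((*count - 1) << 16);
		u32 result, nr_io_queues;
		int status;

		status = nvme_set_features(ctrl, NVME_FEAT_NUM_QUEUES,
					   q_count, NULL, 0, &result);
		if (status)
			return status;

		nr_io_queues = min(result & 0xffff, result >> 16) + 1;
		*count = min_t(int, *count, nr_io_queues);
		return 0;
	}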
> >
> > Now, a software upgrade is started which takes the first server
> > offline, updates it, and brings it online again. Then the same process
> > is repeated with the second server.
> >
> > In the meantime the host tries to reconnect. Eventually, the reconnect
> > is successful but the Number of Queues (Feature Identifier 07h) has
> > changed to a smaller value, e.g. N-2 queues.
> >
> > My test case here is trying to replicate this scenario but with just
> > one target. Hence the problem of how to notify the host that it should
> > reconnect. As you mentioned, this is not supposed to change as long as
> > a connection is established.
> >
> > My understanding is that the current nvme target implementation in
> > Linux doesn't really support this HA setup scenario, hence my attempt
> > to get it flying with one target. The DNR bit comes into play because
> > I was toying with removing the subsystem from the port, changing the
> > number of queues, and re-adding the subsystem to the port.
> >
> > This resulted in every request posted from the host seeing the DNR
> > bit. The second attempt here was to delete the controller to force a
> > reconnect. I agree it's also not really the right thing to do.
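
A user-space sketch of that sequence against the nvmet configfs tree
("testnqn" and port "1" are examples; attr_qid_max is the attribute
this series proposes, so treat the name as illustrative):

	/* Sketch of the test sequence described above. */
	#include <stdio.h>
	#include <unistd.h>

	#define SUBSYS "/sys/kernel/config/nvmet/subsystems/testnqn"
	#define LINK   "/sys/kernel/config/nvmet/ports/1/subsystems/testnqn"

	static void write_attr(const char *path, const char *val)
	{
		FILE *f = fopen(path, "w");

		if (f) {
			fputs(val, f);
			fclose(f);
		}
	}

	int main(void)
	{
		unlink(LINK);				/* remove subsys from port */
		write_attr(SUBSYS "/attr_qid_max", "2");/* shrink queue count */
		symlink(SUBSYS, LINK);			/* re-add; host reconnects */
		return 0;
	}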
> >
> > As far as I can tell, what's missing from a testing point of view is
> > the ability to fail requests without the DNR bit set, or the ability
> > to tell the host to reconnect. Obviously, an AEN would be nice for
> > this but I don't know if this is reason enough to extend the spec.
>
> Looking into the code, it's the connect that fails on invalid parameter
> with DNR set, because the host is attempting to connect to a subsystem
> that does not exist on the port (because it was taken offline for
> maintenance reasons).
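
Roughly this path (a condensed sketch of the checks around controller
allocation in drivers/nvme/target/, not the literal code):

	/*
	 * Condensed sketch of why the host sees DNR: during controller
	 * allocation for a Connect, a subsystem that is not reachable
	 * through the port fails with invalid parameter + DNR.
	 */
	static u16 nvmet_check_connect_subsys(struct nvmet_req *req,
					      const char *subsysnqn)
	{
		struct nvmet_subsys *subsys;

		subsys = nvmet_find_get_subsys(req->port, subsysnqn);
		if (!subsys)
			return NVME_SC_CONNECT_INVALID_PARAM | NVME_SC_DNR;

		nvmet_subsys_put(subsys);
		return NVME_SC_SUCCESS;
	}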
>
> So I guess it is valid to allow queue change without removing it from the
> port, but that does not change the fundamental question on DNR.
> If the host sees a DNR error on connect, my interpretation is that the host
> should not retry the connect command itself, but it shouldn't imply
> anything about tearing down the controller and giving up on it
> completely, forever.