[PATCH V2 0/4] nvme: fix two kinds of IO hang from removing NSs

Thu Jun 22 06:51:12 PDT 2023

On Wed, Jun 21, 2023 at 09:48:49AM -0600, Keith Busch wrote:
> On Wed, Jun 21, 2023 at 09:27:31PM +0800, Ming Lei wrote:
> > On Wed, Jun 21, 2023 at 01:13:05PM +0300, Sagi Grimberg wrote:
> > > 
> > > > > > > > Hello,
> > > > > > > > 
> > > > > > > > The 1st three patch fixes io hang when controller removal interrupts error
> > > > > > > > recovery, then queue is left as frozen.
> > > > > > > > 
> > > > > > > > The 4th patch fixes io hang when controller is left as unquiesce.
> > > > > > > 
> > > > > > > Ming, what happened to nvme-tcp/rdma move of freeze/unfreeze to the
> > > > > > > connect patches?
> > > > > > 
> > > > > > I'd suggest to handle all drivers(include nvme-pci) in same logic for avoiding
> > > > > > extra maintain burden wrt. error handling, but looks Keith worries about the
> > > > > > delay freezing may cause too many requests queued during error handling, and
> > > > > > that might cause user report.
> > > > > 
> > > > > For nvme-tcp/rdma your patch also addresses IO not failing over because
> > > > > they block on queue enter. So I definitely want this for fabrics.
> > > > 
> > > > The patch in the following link should fix these issues too:
> > > > 
> > > > https://lore.kernel.org/linux-block/ZJGmW7lEaipT6saa@ovpn-8-23.pek2.redhat.com/T/#u
> > > > 
> > > > I guess you still want the paired freeze patch because it makes freeze &
> > > > unfreeze more reliable in error handling. If yes, I can make one fabric
> > > > only change for you.
> > > 
> > > Not sure exactly what reliability is referred here.
> > 
> > freeze and unfreeze have to be called strictly in pair, but nvme calls
> > the two from different contexts, so unfreeze may easily be missed, and
> > causes mismatched freeze count. There has many such reports so far.
> > 
> > > I agree that there
> > > is an issue with controller delete during error recovery. The patch
> > > was a way to side-step it, great. But it addressed I/O blocked on enter
> > > and not failing over.
> > > 
> > > So yes, for fabrics we should have it. I would argue that it would be
> > > the right thing to do for pci as well. But I won't argue if Keith feels
> > > otherwise.
> > 
> > Keith, can you update with us if you are fine with moving
> > nvme_start_freeze() into nvme_reset_work() for nvme pci driver?
> 
> The point was to contain requests from entering while the hctx's are
> being reconfigured. If you're going to pair up the freezes as you've
> suggested, we might as well just not call freeze at all.

blk_mq_update_nr_hw_queues() requires queue to be frozen.

Thanks,
Ming