[PATCH 2/2] nvme: don't freeze/unfreeze queues from different contexts

Tue Jun 13 17:38:43 PDT 2023

On Tue, Jun 13, 2023 at 08:41:46AM -0600, Keith Busch wrote:
> On Tue, Jun 13, 2023 at 08:58:47AM +0800, Ming Lei wrote:
> > And this way is correct because quiesce is enough for driver to handle
> > error recovery. The only difference is where to wait during error recovery.
> > With this way, IO is just queued in block layer queue instead of
> > __bio_queue_enter(), finally waiting for completion is done in upper
> > layer. Either way, IO can't move on during error recovery.
> 
> The point was to contain the fallout from modifying the hctx mappings.

blk_mq_update_nr_hw_queues() is called after nvme_wait_freeze
returns, nothing changes here, so correctness wrt. updating hctx
mapping is provided.

> If you allow IO to queue in the blk-mq layer while a reset is in
> progress, they may be entering a context that won't be as expected on
> the other side of the reset.

The only difference is that in-tree code starts to freeze
at the beginning of error recovery, which way can just prevent new IO,
and old ones still are queued, but can't be dispatched to driver
because of quiescing in both ways. With this patch, new IOs queued
after error recovery are just like old ones canceled before resetting.

So not see problems from driver side with this change, and nvme
driver has to cover new IOs queued after error happens.

Thanks.
Ming