[PATCH 1/2] nvme-tcp: fix controller reset hang during traffic
Christoph Hellwig
hch at lst.de
Tue Jul 28 06:49:59 EDT 2020
On Fri, Jul 24, 2020 at 03:10:12PM -0700, Sagi Grimberg wrote:
> commit fe35ec58f0d3 ("block: update hctx map when use multiple maps")
> exposed an issue where we may hang trying to wait for queue freeze
> during I/O. We call blk_mq_update_nr_hw_queues which in case of multiple
> queue maps (which we have now for default/read/poll) is attempting to
> freeze the queue. However we never started queue freeze when starting the
> reset, which means that we have inflight pending requests that entered the
> queue that we will not complete once the queue is quiesced.
>
> So start a freeze before we quiesce the queue, and unfreeze the queue
> after we successfully connected the I/O queues (and make sure to call
> blk_mq_update_nr_hw_queues only after we are sure that the queue was
> already frozen).
>
> This follows how the pci driver handles resets.
>
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
Applied to nvme-5.9. I've also added a Fixes tag.
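
For readers following the thread, the ordering described in the quoted commit message can be modeled in userspace C as a rough sketch. The model_* functions below are hypothetical stand-ins that only record the order in which they are called; they are not kernel APIs and do not reproduce the driver code. Restarting the queues before waiting for the freeze is an assumption made here so that in-flight requests can drain; the commit message does not spell that step out.

```c
#include <assert.h>
#include <string.h>

/* Userspace model only: every step just records its name in a trace. */
#define MAX_STEPS 16

static const char *trace[MAX_STEPS];
static int nsteps;

static void record(const char *step)
{
	assert(nsteps < MAX_STEPS);
	trace[nsteps++] = step;
}

/* Hypothetical stand-ins for the block-layer and driver operations. */
static void model_start_freeze(void)        { record("start_freeze"); }
static void model_quiesce(void)             { record("quiesce"); }
static void model_connect_io_queues(void)   { record("connect_io_queues"); }
static void model_restart_queues(void)      { record("restart_queues"); }
static void model_wait_freeze(void)         { record("wait_freeze"); }
static void model_update_nr_hw_queues(void) { record("update_nr_hw_queues"); }
static void model_unfreeze(void)            { record("unfreeze"); }

/* Teardown half of a reset: start the freeze before quiescing. */
static void model_teardown_io_queues(void)
{
	model_start_freeze();
	model_quiesce();
}

/* Reconnect half: update the hw queue count only once the queue is frozen. */
static void model_reconnect_io_queues(void)
{
	model_connect_io_queues();
	model_restart_queues();      /* assumption: lets in-flight requests drain */
	model_wait_freeze();
	model_update_nr_hw_queues();
	model_unfreeze();
}

/* Run one modeled reset from a clean trace. */
static void model_reset(void)
{
	nsteps = 0;
	model_teardown_io_queues();
	model_reconnect_io_queues();
}

/* Helper for inspecting the trace: returns 1 if step i has the given name. */
static int trace_is(int i, const char *name)
{
	return i >= 0 && i < nsteps && strcmp(trace[i], name) == 0;
}
```

After calling model_reset(), the trace holds the steps in the order above, with the freeze started before the quiesce and the queue-count update placed between the freeze wait and the unfreeze.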