[PATCH 1/2] nvme-tcp: fix controller reset hang during traffic

Christoph Hellwig hch at lst.de
Tue Jul 28 06:49:59 EDT 2020


On Fri, Jul 24, 2020 at 03:10:12PM -0700, Sagi Grimberg wrote:
> commit fe35ec58f0d3 ("block: update hctx map when use multiple maps")
> exposed an issue where we may hang trying to wait for queue freeze
> during I/O. We call blk_mq_update_nr_hw_queues which in case of multiple
> queue maps (which we have now for default/read/poll) is attempting to
> freeze the queue. However we never started queue freeze when starting the
> reset, which means that we have inflight pending requests that entered the
> queue that we will not complete once the queue is quiesced.
> 
> So start a freeze before we quiesce the queue, and unfreeze the queue
> after we successfully connected the I/O queues (and make sure to call
> blk_mq_update_nr_hw_queues only after we are sure that the queue was
> already frozen).
> 
> This follows how the pci driver handles resets.
> 
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>

Applied to nvme-5.9.  I've also added a fixes tag.