[PATCH 1/2] nvme-tcp: fix controller reset hang during traffic
Sagi Grimberg
sagi at grimberg.me
Tue Jul 28 11:57:24 EDT 2020
>> commit fe35ec58f0d3 ("block: update hctx map when use multiple maps")
>> exposed an issue where we may hang trying to wait for queue freeze
>> during I/O. We call blk_mq_update_nr_hw_queues which in case of multiple
>> queue maps (which we have now for default/read/poll) is attempting to
>> freeze the queue. However we never started queue freeze when starting the
>> reset, which means that we have inflight pending requests that entered the
>> queue that we will not complete once the queue is quiesced.
>>
>> So start a freeze before we quiesce the queue, and unfreeze the queue
>> after we successfully connected the I/O queues (and make sure to call
>> blk_mq_update_nr_hw_queues only after we are sure that the queue was
>> already frozen).
>>
>> This follows how the pci driver handles resets.
>>
>> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
>
> Applied to nvme-5.9. I've also added a fixes tag.
It doesn't fix a regression caused by this commit, but rather one exposed
by this commit. I agree that it's good to have it for stable.
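
For anyone following along, the ordering described in the commit message can
be sketched with a toy user-space model. This is only an illustration under
simplifying assumptions, not the driver code: struct model_queue and every
model_* / reset_* name below are hypothetical stand-ins, and a "hang" is
modeled as a false return instead of a blocked task.

```c
#include <stdbool.h>

/* Toy model of a request queue; NOT the kernel's struct request_queue. */
struct model_queue {
	int  inflight;      /* requests that entered and have not completed */
	int  freeze_depth;  /* > 0: new requests may not enter */
	bool quiesced;      /* true: dispatch is stopped */
	int  nr_hw_queues;
};

/* A request may enter only while the queue is not frozen. */
static bool model_submit(struct model_queue *q)
{
	if (q->freeze_depth > 0)
		return false;
	q->inflight++;
	return true;
}

/*
 * Waiting for a freeze means waiting for inflight requests to drain.
 * A quiesced queue cannot drain, so in this model that case returns
 * false where a real waiter would block.
 */
static bool model_wait_freeze(struct model_queue *q)
{
	if (q->quiesced && q->inflight > 0)
		return false;
	q->inflight = 0;
	return true;
}

/* Stand-in for a queue-count update that freezes the queue internally. */
static bool model_update_nr_hw_queues(struct model_queue *q, int nr)
{
	bool ok;

	q->freeze_depth++;
	ok = model_wait_freeze(q);
	if (ok)
		q->nr_hw_queues = nr;
	q->freeze_depth--;
	return ok;
}

/* Old ordering: quiesce without a freeze, then update while quiesced. */
static bool reset_without_freeze(struct model_queue *q, int nr)
{
	q->quiesced = true;
	model_submit(q);	/* enters: nothing froze the queue */
	return model_update_nr_hw_queues(q, nr);
}

/* Patched ordering: freeze, quiesce, reconnect, restart, wait, update. */
static bool reset_with_freeze(struct model_queue *q, int nr)
{
	bool ok;

	q->freeze_depth++;	/* start freeze before quiescing */
	q->quiesced = true;
	model_submit(q);	/* refused: the queue is frozen */
	/* teardown and reconnect of the I/O queues would happen here */
	q->quiesced = false;
	ok = model_wait_freeze(q) && model_update_nr_hw_queues(q, nr);
	q->freeze_depth--;	/* unfreeze after a successful reconnect */
	return ok;
}
```

In the first sequence a request slips in after the quiesce and the update
can never see the queue drain; in the second the freeze keeps it out, so the
update only runs on a queue that is already frozen and empty.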
Thanks