[PATCH 2/3] block: Fix __blk_mq_update_nr_hw_queues() queue freeze and limits lock order
Damien Le Moal
dlemoal at kernel.org
Mon Jan 6 01:58:00 PST 2025
On 1/6/25 5:30 PM, Christoph Hellwig wrote:
> On Sat, Jan 04, 2025 at 10:25:21PM +0900, Damien Le Moal wrote:
>> __blk_mq_update_nr_hw_queues() freezes a device queues during operation,
>> which also includes updating the BLK_FEAT_POLL feature flag for the
>> device queues using queue_limits_start_update() and
>> queue_limits_commit_update(). This call order thus creates an invalid
>> ordering of a queue freeze and queue limit locking which can lead to a
>> deadlock when the device driver must issue commands to probe the device
>> when revalidating its limits.
>>
>> Avoid this issue by moving the update of the BLK_FEAT_POLL feature flag
>> out of the main queue remapping loop to the end of
>> __blk_mq_update_nr_hw_queues(), after the device queues have been
>> unfrozen.
>
> What happens if I/O is queued after the unfreeze, but before clearing
> the poll flag?
Ah, yes, that would potentially be an issue... Hmmm... Maybe a better solution
would be to move the start update out of the main loop and do it first, before
the freeze. What I do not fully understand with the code of this function is
that it does freeze and limit update for each tag list of the tag set, but
there is only a single request queue for all of them. So I am confused. Why
does the blk_mq_freeze_queue and poll limit setting have to be done in the loop
multiple times for each tag list ? I do not see it... If we can move these out
of the tag list loops, then correcting the ordering becomes easy.
--
Damien Le Moal
Western Digital Research
More information about the Linux-nvme
mailing list