[PATCH 1/2] nvme: make NVMe freeze API reliably

Chao Leng lengchao at huawei.com
Tue Sep 6 02:32:01 PDT 2022



On 2022/9/6 16:45, Ming Lei wrote:
> On Thu, Aug 25, 2022 at 06:02:33PM +0800, Chao Leng wrote:
>>
>>
>> On 2022/8/21 16:47, Ming Lei wrote:
>>> From: Keith Busch <kbusch at kernel.org>
>>>
>>> In some corner cases[1], freeze wait and unfreeze API may be called on
>>> unfrozen queue, add one per-ns flag of NVME_NS_FREEZE to make these
>>> freeze APIs more reliably, then this kind of issues can be avoided.
>>> And similar approach has been applied on stopping/quiescing nvme queues.
>> This leads to another problem: the process that needs to be
>> in the frozen state is not actually frozen.
>> It's not safe.
> 
> The flag is just to control if queue wait is needed, blk_mq_freeze_queue_wait
> can be done only the flag is set. Not sure how it isn't safe.
I thought NVME_NS_FREEZE was meant to be used the same way as NVME_NS_STOPPED.
If nvme_start_freeze only does set_bit, it causes another problem in the
scenario below.
A: starts freeze and sets the bit; B: starts freeze and sets the bit;
and then
A: tests and clears the bit, and unfreezes; B: tests the bit and skips the unfreeze.
The queue then stays frozen forever.
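
The interleaving above can be modelled in user space. This is only a rough
sketch, not kernel code: freeze_depth stands in for the queue's nested freeze
count, freeze_flag stands in for the NVME_NS_FREEZE bit, and every model_*
name is hypothetical and does not come from the patch.

```c
#include <stdbool.h>

/* Hypothetical user-space model of one namespace queue; not kernel code. */
struct model_ns {
	int freeze_depth;	/* stands in for the nested freeze count */
	bool freeze_flag;	/* stands in for the NVME_NS_FREEZE bit */
};

/* Models a start-freeze that unconditionally sets the flag. */
static void model_start_freeze(struct model_ns *ns)
{
	ns->freeze_flag = true;
	ns->freeze_depth++;
}

/* Models an unfreeze that is guarded by test-and-clear of the flag. */
static void model_unfreeze(struct model_ns *ns)
{
	bool was_set = ns->freeze_flag;

	ns->freeze_flag = false;
	if (!was_set)
		return;		/* flag already cleared: the unfreeze is skipped */
	ns->freeze_depth--;
}

/* Replays the A/B interleaving and returns the remaining freeze depth. */
int model_run_interleaving(void)
{
	struct model_ns ns = { 0, false };

	model_start_freeze(&ns);	/* A: start freeze, set the bit */
	model_start_freeze(&ns);	/* B: start freeze, set the bit */
	model_unfreeze(&ns);		/* A: test and clear, unfreeze */
	model_unfreeze(&ns);		/* B: bit is clear, skip */
	return ns.freeze_depth;
}
```

In this model, model_run_interleaving() returns 1: two freezes were started
but only one was undone, so the modelled queue never reaches depth 0.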

In addition, I think patch 2/2 fixes the bug on its own, so patch 1/2 is not
necessary. However NVME_NS_FREEZE is used, it may cause problems.
The freeze mechanism is already sufficient by itself, and no additional
protection mechanism is required.
> 
> Meantime calling blk_mq_freeze_queue_wait() on queue not being started
> to freeze is usually a bug, and I think WARN_ON_ONCE() can be added
> in nvme_wait_freeze().
> 
> 
> Thanks,
> Ming


