[PATCH 1/2] nvme: make NVMe freeze API reliably

Ming Lei ming.lei at redhat.com
Tue Sep 6 17:33:14 PDT 2022


On Tue, Sep 06, 2022 at 05:32:01PM +0800, Chao Leng wrote:
> 
> 
> On 2022/9/6 16:45, Ming Lei wrote:
> > On Thu, Aug 25, 2022 at 06:02:33PM +0800, Chao Leng wrote:
> > > 
> > > 
> > > On 2022/8/21 16:47, Ming Lei wrote:
> > > > From: Keith Busch <kbusch at kernel.org>
> > > > 
> > > > In some corner cases[1], the freeze wait and unfreeze APIs may be called on
> > > > an unfrozen queue. Add a per-ns flag, NVME_NS_FREEZE, to make these
> > > > freeze APIs more reliable, so that this kind of issue can be avoided.
> > > > A similar approach has been applied to stopping/quiescing nvme queues.
> > > This leads to another problem: the process that needs to be
> > > in the frozen state is not actually frozen.
> > > It's not safe.
> > 
> > The flag just controls whether the queue wait is needed: blk_mq_freeze_queue_wait
> > can be done only if the flag is set. I am not sure how that is unsafe.
> I thought that NVME_NS_FREEZE was used in the same way as NVME_NS_STOPPED.
> If nvme_start_freeze just does set_bit, it will cause another problem in
> the scenario below.
> A: start freeze and set the bit; B: start freeze and set the bit;
> and then
> A: test and clear the bit, and unfreeze; B: test and skip;
> The queue will be frozen forever.

One simple approach is to replace down_read(->namespaces_rwsem) with
down_write(->namespaces_rwsem) in nvme_start_freeze() and
nvme_unfreeze().

> 
> In addition, I think patch 2/2 fixes the bug well, and patch 1/2 is not necessary.
> However NVME_NS_FREEZE is used, it may cause problems.
> The freeze mechanism is perfect, and no additional protection mechanism is required.

The block layer requires the queue freeze and unfreeze APIs to be called
strictly in pairs; that is why I added the 1st patch.



Thanks,
Ming



