[PATCH 3/3] nvme: freeze both the ns and multipath queues when updating queue limits

Hannes Reinecke hare at suse.de
Mon Sep 2 03:36:42 PDT 2024


On 8/29/24 16:34, Hannes Reinecke wrote:
> On 8/29/24 08:31, Christoph Hellwig wrote:
>> We really should not have any I/O outstanding while updating the queue
>> limits, as that could introduce inconsistencies between the multipath
>> and underlying nodes.  So always freeze the multipath node first and then
>> the underlying one, matching the order in nvme_passthru_start.
>>
>> Signed-off-by: Christoph Hellwig <hch at lst.de>
>> ---
>>   drivers/nvme/host/core.c | 41 ++++++++++++++++++++++++++++------------
>>   1 file changed, 29 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index 81d2a09dd00f30..0f01a0bf976fd3 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
[  .. ]
>> @@ -2254,7 +2272,9 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
>>   done:
>>       set_disk_ro(ns->disk, nvme_ns_is_readonly(ns, info));
>>       set_bit(NVME_NS_READY, &ns->flags);
>> -    blk_mq_unfreeze_queue(ns->disk->queue);
>> +    if (nvme_ns_head_multipath(ns->head))
>> +        ret = nvme_update_ns_info_mpath(ns, info, unsupported);
>> +    nvme_ns_unfreeze(ns);
>>       if (blk_queue_is_zoned(ns->queue)) {
>>           ret = blk_revalidate_disk_zones(ns->disk);
> 
> While we're shuffling things around; doesn't that mean we unfreeze 
> the queue (and allow I/O) before we've validated all zones?
> Wouldn't it be better to validate the zones before unfreezing the queues?
> 
> Otherwise it looks good; I'll give it a spin on my map/unmap testcases.
> 
Hmm. Turns out we're not quite there yet.
We call 'set_capacity_and_notify()' unconditionally a few lines up, 
before setting the namespace to 'READY'. And then we make another
call to 'set_capacity_and_notify()' from nvme_update_ns_info_mpath().
In my map/unmap tests I've seen a stall with udev, where the initial
scanning process is stuck in device_add_disk() trying to read the 
partition table (and waiting for the uptodate bit to clear), while the 
udev process is closing the bdev and waiting in 'truncate_filemap_range'.

With this patch:

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index fdd67a55a477..8858e5bf49de 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2258,8 +2258,6 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
                 goto out;
         }

-       set_capacity_and_notify(ns->disk, capacity);
-
         /*
          * Only set the DEAC bit if the device guarantees that reads from
          * deallocated data return zeroes.  While the DEAC bit does not
@@ -2274,6 +2272,8 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
         set_bit(NVME_NS_READY, &ns->flags);
         if (nvme_ns_head_multipath(ns->head))
                 ret = nvme_update_ns_info_mpath(ns, info, unsupported);
+       else
+               set_capacity_and_notify(ns->disk, capacity);
         nvme_ns_unfreeze(ns);

         if (blk_queue_is_zoned(ns->queue)) {

on top of your patchset I no longer see the problem.
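For reference, with both your patch and the fixup above applied, the tail of
nvme_update_ns_info_block() would look roughly like this. This is only a
sketch reconstructed from the hunks quoted in this thread, not the full
function; the surrounding error handling and the zone-revalidation ordering
question raised above are left as in the original:

```c
done:
	set_disk_ro(ns->disk, nvme_ns_is_readonly(ns, info));
	set_bit(NVME_NS_READY, &ns->flags);
	if (nvme_ns_head_multipath(ns->head))
		/* multipath: the mpath update notifies the new capacity */
		ret = nvme_update_ns_info_mpath(ns, info, unsupported);
	else
		/* single path: announce the new capacity here instead */
		set_capacity_and_notify(ns->disk, capacity);
	nvme_ns_unfreeze(ns);

	if (blk_queue_is_zoned(ns->queue))
		ret = blk_revalidate_disk_zones(ns->disk);
```

The point being that set_capacity_and_notify() is now called exactly once
per update, and only after NVME_NS_READY is set, so udev never sees the
capacity change on a namespace that is still frozen.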

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare at suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
