[PATCH 3/3] nvme: freeze both the ns and multipath queues when updating queue limits
Hannes Reinecke
hare at suse.de
Mon Sep 2 03:36:42 PDT 2024
On 8/29/24 16:34, Hannes Reinecke wrote:
> On 8/29/24 08:31, Christoph Hellwig wrote:
>> We really should not have any I/O outstanding while updating the queue
>> limits, as that could introduce inconsistencies between the multipath
>> and underlying nodes. So always freeze the multipath node first and then
>> the underlying one, matching the order in nvme_passthru_start.
>>
>> Signed-off-by: Christoph Hellwig <hch at lst.de>
>> ---
>> drivers/nvme/host/core.c | 41 ++++++++++++++++++++++++++++------------
>> 1 file changed, 29 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index 81d2a09dd00f30..0f01a0bf976fd3 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
[ .. ]
>> @@ -2254,7 +2272,9 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
>>  done:
>>  	set_disk_ro(ns->disk, nvme_ns_is_readonly(ns, info));
>>  	set_bit(NVME_NS_READY, &ns->flags);
>> -	blk_mq_unfreeze_queue(ns->disk->queue);
>> +	if (nvme_ns_head_multipath(ns->head))
>> +		ret = nvme_update_ns_info_mpath(ns, info, unsupported);
>> +	nvme_ns_unfreeze(ns);
>>  	if (blk_queue_is_zoned(ns->queue)) {
>>  		ret = blk_revalidate_disk_zones(ns->disk);
>
> While we're shuffling things around: doesn't that mean that we unfreeze
> the queue (and allow I/O) before we've validated all zones?
> Wouldn't it be better to validate the zones before unfreezing the queues?
>
> Otherwise it looks good; I'll give it a spin on my map/unmap testcases.
>
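For context, my reading of the freeze helpers this series introduces is
roughly the sketch below; nvme_ns_unfreeze() is the name from the hunk
quoted above, while nvme_ns_freeze() and both bodies are just my
assumption of the intended ordering, not the actual patch:

/*
 * Sketch only: freeze the multipath (head) queue first, then the
 * per-path ns queue, matching nvme_passthru_start(); unfreeze in
 * the reverse order.
 */
static void nvme_ns_freeze(struct nvme_ns *ns)
{
	if (nvme_ns_head_multipath(ns->head))
		blk_mq_freeze_queue(ns->head->disk->queue);
	blk_mq_freeze_queue(ns->queue);
}

static void nvme_ns_unfreeze(struct nvme_ns *ns)
{
	blk_mq_unfreeze_queue(ns->queue);
	if (nvme_ns_head_multipath(ns->head))
		blk_mq_unfreeze_queue(ns->head->disk->queue);
}
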
Hmm. Turns out we're not quite there yet.
We call 'set_capacity_and_notify()' unconditionally a few lines up,
before setting the namespace to 'READY'. And then we're doing another
call to 'set_capacity_and_notify()' (from nvme_update_ns_info_mpath()).
In my map/unmap tests I've seen a stall with udev, where the initial
scanning process is stuck in device_add_disk() trying to read the
partition table (waiting for the uptodate bit to clear), while the
udev process is closing the bdev and waiting on 'truncate_filemap_range'.
With this patch:
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index fdd67a55a477..8858e5bf49de 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2258,8 +2258,6 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 		goto out;
 	}
-	set_capacity_and_notify(ns->disk, capacity);
-
 	/*
 	 * Only set the DEAC bit if the device guarantees that reads from
 	 * deallocated data return zeroes. While the DEAC bit does not
@@ -2274,6 +2272,8 @@ static int nvme_update_ns_info_block(struct nvme_ns *ns,
 	set_bit(NVME_NS_READY, &ns->flags);
 	if (nvme_ns_head_multipath(ns->head))
 		ret = nvme_update_ns_info_mpath(ns, info, unsupported);
+	else
+		set_capacity_and_notify(ns->disk, capacity);
 	nvme_ns_unfreeze(ns);
 	if (blk_queue_is_zoned(ns->queue)) {
on top of your patchset I no longer see the problem.
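So, as far as I can tell, the resulting order in
nvme_update_ns_info_block() for the multipath case becomes (again a
sketch of my understanding, not the literal code; nvme_ns_freeze() is
my assumed counterpart to nvme_ns_unfreeze()):

	nvme_ns_freeze(ns);		/* head queue first, then ns queue */
	/* ... build and commit the new queue limits ... */
done:
	set_disk_ro(ns->disk, nvme_ns_is_readonly(ns, info));
	set_bit(NVME_NS_READY, &ns->flags);
	if (nvme_ns_head_multipath(ns->head))
		ret = nvme_update_ns_info_mpath(ns, info, unsupported);
	else
		set_capacity_and_notify(ns->disk, capacity);
	nvme_ns_unfreeze(ns);		/* ns queue first, then head queue */
	if (blk_queue_is_zoned(ns->queue))
		ret = blk_revalidate_disk_zones(ns->disk);

with the capacity notification in the multipath case now only coming
from nvme_update_ns_info_mpath(). Note that blk_revalidate_disk_zones()
still runs after the unfreeze, so the zone validation question from my
earlier mail still stands.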
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare at suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich