[PATCH] nvme-multipath: don't inherit LBA-related fields for the multipath node
Nilay Shroff
nilay at linux.ibm.com
Thu Mar 21 22:15:04 PDT 2024
On 3/22/24 02:38, Christoph Hellwig wrote:
> Linux 6.9 made the nvme multipath nodes not properly pick up changes when
> the LBA size goes smaller after an nvme format. This is because we now
> try to inherit the queue settings for the multipath node entirely from
> the individual paths. That is the right thing to do for I/O size
> limitations, which make up most of the queue limits, but it is wrong for
> changes to the namespace configuration, where we do want to pick up the
> new format, which will eventually show up on all paths once they are
> re-queried.
>
> Fix this by not inheriting the block size and related fields and always
> updating them.
>
> Fixes: 8f03cfa117e0 ("nvme: don't use nvme_update_disk_info for the multipath disk")
> Reported-by: Nilay Shroff <nilay at linux.ibm.com>
> Signed-off-by: Christoph Hellwig <hch at lst.de>
> ---
> drivers/nvme/host/core.c | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 00864a63447099..4bac54d4e0015b 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2204,6 +2204,7 @@ static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_ns_info *info)
>  	}
>
>  	if (!ret && nvme_ns_head_multipath(ns->head)) {
> +		struct queue_limits *ns_lim = &ns->disk->queue->limits;
>  		struct queue_limits lim;
>
>  		blk_mq_freeze_queue(ns->head->disk->queue);
> @@ -2215,7 +2216,26 @@ static int nvme_update_ns_info(struct nvme_ns *ns, struct nvme_ns_info *info)
>  		set_disk_ro(ns->head->disk, nvme_ns_is_readonly(ns, info));
>  		nvme_mpath_revalidate_paths(ns);
>
> +		/*
> +		 * queue_limits mixes values that are the hardware limitations
> +		 * for bio splitting with what is the device configuration.
> +		 *
> +		 * For NVMe the device configuration can change after e.g. a
> +		 * Format command, and we really want to pick up the new format
> +		 * value here.  But we must still stack the queue limits to the
> +		 * least common denominator for multipathing to split the bios
> +		 * properly.
> +		 *
> +		 * To work around this, we explicitly set the device
> +		 * configuration to those that we just queried, but only stack
> +		 * the splitting limits in to make sure we still obey possibly
> +		 * lower limitations of other controllers.
> +		 */
>  		lim = queue_limits_start_update(ns->head->disk->queue);
> +		lim.logical_block_size = ns_lim->logical_block_size;
> +		lim.physical_block_size = ns_lim->physical_block_size;
> +		lim.io_min = ns_lim->io_min;
> +		lim.io_opt = ns_lim->io_opt;
>  		queue_limits_stack_bdev(&lim, ns->disk->part0, 0,
>  					ns->head->disk->disk_name);
>  		ret = queue_limits_commit_update(ns->head->disk->queue, &lim);
I have tested the above patch from Christoph and it looks good.
The test results can be found here:
https://lore.kernel.org/all/239228ec-6c8d-432c-905d-b477014deee3@linux.ibm.com/
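
For anyone following along, here is a small userspace sketch of why the
multipath node kept the old LBA size and why the explicit assignments in the
patch help. It is a simplified model and not the kernel code: struct
model_limits, model_stack() and the two update_head_*() helpers are made-up
names for illustration, and it assumes that stacking takes the larger of the
two block sizes and the smaller non-zero max_sectors.

```c
#include <assert.h>

/* Simplified stand-in for struct queue_limits (illustration only). */
struct model_limits {
	unsigned int logical_block_size;	/* device configuration */
	unsigned int physical_block_size;	/* device configuration */
	unsigned int max_sectors;		/* bio splitting limit */
};

static unsigned int max_u(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

static unsigned int min_not_zero_u(unsigned int a, unsigned int b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return a < b ? a : b;
}

/*
 * Hypothetical model of limit stacking: block sizes can only grow,
 * splitting limits can only shrink.
 */
static void model_stack(struct model_limits *top,
			const struct model_limits *bottom)
{
	assert(top && bottom);
	top->logical_block_size = max_u(top->logical_block_size,
					bottom->logical_block_size);
	top->physical_block_size = max_u(top->physical_block_size,
					 bottom->physical_block_size);
	top->max_sectors = min_not_zero_u(top->max_sectors,
					  bottom->max_sectors);
}

/* Inherit everything from the path: a smaller block size is never seen. */
static struct model_limits
update_head_inherit_only(struct model_limits head,
			 const struct model_limits *path)
{
	model_stack(&head, path);
	return head;
}

/* Take the freshly queried block sizes first, then stack the rest. */
static struct model_limits
update_head_with_override(struct model_limits head,
			  const struct model_limits *path)
{
	head.logical_block_size = path->logical_block_size;
	head.physical_block_size = path->physical_block_size;
	model_stack(&head, path);
	return head;
}
```

With a head that still has 4096-byte blocks and a path that was just
reformatted to 512-byte blocks, update_head_inherit_only() leaves the head at
4096, while update_head_with_override() moves it to 512 and still keeps the
lower max_sectors of the two.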
Thanks,
--Nilay