[PATCH] nvme: clamp max_hw_sectors based on DMA optimized limitation
Keith Busch
kbusch at kernel.org
Thu Apr 20 08:29:30 PDT 2023
On Thu, Apr 20, 2023 at 09:01:55PM +0800, Adrian Huang wrote:
> To fix the lock contention issue, clamp max_hw_sectors based on
> DMA optimized limitation in order to leverage scalable IOVA mechanism.
>
> Note: The issue does not happen with another NVME disk (mdts = 5
> and max_hw_sectors_kb = 128)
Thanks for the patch. I think this makes sense.
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 53ef028596c6..c0d1ea889b4d 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -1819,11 +1819,16 @@ static void nvme_set_queue_limits(struct nvme_ctrl *ctrl,
> bool vwc = ctrl->vwc & NVME_CTRL_VWC_PRESENT;
>
> if (ctrl->max_hw_sectors) {
> - u32 max_segments =
> - (ctrl->max_hw_sectors / (NVME_CTRL_PAGE_SIZE >> 9)) + 1;
> + u32 opt_sectors, max_sectors; /* optimized/max sectors */
> + u32 max_segments;
> +
> + opt_sectors = dma_opt_mapping_size(ctrl->dev) >> SECTOR_SHIFT;
> + max_sectors = min_not_zero(ctrl->max_hw_sectors, opt_sectors);
> +
> + max_segments = (max_sectors / (NVME_CTRL_PAGE_SIZE >> 9)) + 1;
>
> max_segments = min_not_zero(max_segments, ctrl->max_segments);
> - blk_queue_max_hw_sectors(q, ctrl->max_hw_sectors);
> + blk_queue_max_hw_sectors(q, max_sectors);
> blk_queue_max_segments(q, min_t(u32, max_segments, USHRT_MAX));
> }
> blk_queue_virt_boundary(q, NVME_CTRL_PAGE_SIZE - 1);
Taking into account what Linus mentioned on a similar patch[1], I think it may
make more sense for the lower level driver code to have already capped
ctrl->max_hw_sectors prior to calling this function. Something like the patch
below.
[1] https://lore.kernel.org/all/CAHk-=whogEk1UJfU3E7aW18PDYRbdAzXta5J0ECg=CB5=sCe7g@mail.gmail.com/
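To make the intended ordering concrete, here is a rough user-space model of
the clamp, not kernel code. The helper names (min_not_zero_sz,
clamp_max_hw_sectors) are hypothetical, and so are the numbers fed to them;
min_not_zero_sz only imitates what the kernel's min_not_zero() does. The
sketch also assumes the driver cap is given in KiB and converts it to bytes
before comparing it with the DMA limits.

```c
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SHIFT 9	/* 512-byte sectors */

/* Smaller of a and b, where 0 means "no limit reported". */
static size_t min_not_zero_sz(size_t a, size_t b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return a < b ? a : b;
}

/*
 * Hypothetical helper: cap the transfer size once, up front.
 *   driver_kb     - driver's own cap, in KiB
 *   dma_max_bytes - hard DMA mapping limit, in bytes
 *   dma_opt_bytes - optimal DMA mapping size, in bytes (0 if unknown)
 * Returns the resulting limit in 512-byte sectors.
 */
static uint32_t clamp_max_hw_sectors(size_t driver_kb, size_t dma_max_bytes,
				     size_t dma_opt_bytes)
{
	size_t max_bytes = driver_kb * 1024;	/* KiB -> bytes */

	if (dma_max_bytes < max_bytes)
		max_bytes = dma_max_bytes;
	max_bytes = min_not_zero_sz(max_bytes, dma_opt_bytes);

	return (uint32_t)(max_bytes >> SECTOR_SHIFT);
}
```

With a 4096 KiB driver cap, no hard DMA limit (SIZE_MAX) and an illustrative
128 KiB optimal size, this yields 256 sectors. If the optimal size is reported
as 0, the driver cap wins and the result is 8192 sectors.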
Side note, I think he's incorrect about using the max_segment_size limit since
the dma code will collapse physically contiguous segments, so splitting bvecs
for that limit won't really help.
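As a toy illustration of that side note (an assumption-laden model, not the
actual DMA mapping code): if neighbouring segments are merged whenever one
ends exactly where the next begins, then splitting a contiguous buffer into
smaller pieces beforehand gets undone at mapping time. The struct and function
names below are hypothetical, and the model ignores any size or boundary
limits the real code may apply.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment descriptor: physical start address and length. */
struct seg {
	uint64_t addr;
	uint64_t len;
};

/*
 * Merge physically contiguous neighbours in place.
 * Returns the number of segments left after merging.
 */
static size_t coalesce_segments(struct seg *segs, size_t n)
{
	size_t out = 0;

	if (n == 0)
		return 0;

	for (size_t i = 1; i < n; i++) {
		if (segs[out].addr + segs[out].len == segs[i].addr)
			segs[out].len += segs[i].len;	/* extends previous */
		else
			segs[++out] = segs[i];		/* starts a new one */
	}
	return out + 1;
}
```

Three 4 KiB pieces at 0x1000, 0x2000 and 0x4000 come out as two segments: one
8 KiB segment at 0x1000 and one 4 KiB segment at 0x4000.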
---
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 282d808400c5b..8505fbeaa2d2f 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2914,6 +2914,12 @@ static struct nvme_dev *nvme_pci_alloc_dev(struct pci_dev *pdev,
struct nvme_dev *dev;
int ret = -ENOMEM;
+ /*
+ * Limit the max command size to prevent iod->sg allocations going
+ * over a single page.
+ */
+ size_t max_bytes = (size_t)NVME_MAX_KB_SZ << 10;
+
if (node == NUMA_NO_NODE)
set_dev_node(&pdev->dev, first_memory_node);
@@ -2955,12 +2961,9 @@ static struct nvme_dev *nvme_pci_alloc_dev(struct pci_dev *pdev,
dma_set_min_align_mask(&pdev->dev, NVME_CTRL_PAGE_SIZE - 1);
dma_set_max_seg_size(&pdev->dev, 0xffffffff);
- /*
- * Limit the max command size to prevent iod->sg allocations going
- * over a single page.
- */
- dev->ctrl.max_hw_sectors = min_t(u32,
- NVME_MAX_KB_SZ << 1, dma_max_mapping_size(&pdev->dev) >> 9);
+ max_bytes = min(max_bytes, dma_max_mapping_size(&pdev->dev));
+ max_bytes = min_not_zero(max_bytes, dma_opt_mapping_size(&pdev->dev));
+ dev->ctrl.max_hw_sectors = max_bytes >> 9;
dev->ctrl.max_segments = NVME_MAX_SEGS;
/*
--