[bug report] NVMe/IB: reset_controller need more than 1min

Max Gurtovoy mgurtovoy at nvidia.com
Sun Mar 20 08:11:29 PDT 2022


On 3/20/2022 3:03 PM, Sagi Grimberg wrote:
>
>>>> These are very long times for a non-debug kernel...
>>>> Max, do you see the root cause for this?
>>>>
>>>> Yi, does this happen with rxe/siw as well?
>>> Hi Sagi
>>>
>>> rxe/siw will take less than 1s
>>> with rdma_rxe
>>> # time nvme reset /dev/nvme0
>>> real 0m0.094s
>>> user 0m0.000s
>>> sys 0m0.006s
>>>
>>> with siw
>>> # time nvme reset /dev/nvme0
>>> real 0m0.097s
>>> user 0m0.000s
>>> sys 0m0.006s
>>>
>>> This is only reproducible with mlx IB card, as I mentioned before, the
>>> reset operation time changed from 3s to 12s after the below commit,
>>> could you check this commit?
>>>
>>> commit 5ec5d3bddc6b912b7de9e3eb6c1f2397faeca2bc
>>> Author: Max Gurtovoy <maxg at mellanox.com>
>>> Date:   Tue May 19 17:05:56 2020 +0300
>>>
>>>      nvme-rdma: add metadata/T10-PI support
>>>
>> I couldn't repro these long reset times.
>
> It appears to be when setting up a controller with lots of queues
> maybe?

I'm doing that.

>
>> Nevertheless, the above commit added T10-PI offloads.
>>
>> In this commit, for supported devices we create extra resources in HW 
>> (more memory keys per task).
>>
>> I suggested doing this configuration as part of the "nvme connect" 
>> command and save this resource allocation by default but during the 
>> review I was asked to make it the default behavior.
>
> Don't know if I gave you this feedback or not, but it probably didn't
> occur to the commenter that it will make the connection establishment
> take tens of seconds.

It was known that we create more resources than needed for 
"non-PI-desired" controllers.


>
>> Sagi/Christoph,
>>
>> WDYT ? should we reconsider the "nvme connect --with_metadata" option ?
>
> Maybe you can make these lazily allocated?

You mean something like:

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index fd4720d37cc0..367ba0bb62ab 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1620,10 +1620,19 @@ int nvme_getgeo(struct block_device *bdev, 
struct hd_geometry *geo)
  }

  #ifdef CONFIG_BLK_DEV_INTEGRITY
-static void nvme_init_integrity(struct gendisk *disk, u16 ms, u8 pi_type,
-                               u32 max_integrity_segments)
+static int nvme_init_integrity(struct gendisk *disk, struct nvme_ns *ns)
  {
         struct blk_integrity integrity = { };
+       u16 ms = ns->ms;
+       u8 pi_type = ns->pi_type;
+       u32 max_integrity_segments = ns->ctrl->max_integrity_segments;
+       int ret;
+
+       if (ns->ctrl->ops->init_integrity) {
+               ret = ns->ctrl->ops->init_integrity(ns->ctrl);
+               if (ret)
+                       return ret;
+       }

         switch (pi_type) {
         case NVME_NS_DPS_PI_TYPE3:
@@ -1644,11 +1653,13 @@ static void nvme_init_integrity(struct gendisk 
*disk, u16 ms, u8 pi_type,
         integrity.tuple_size = ms;
         blk_integrity_register(disk, &integrity);
         blk_queue_max_integrity_segments(disk->queue, 
max_integrity_segments);
+
+       return 0;
  }
  #else
-static void nvme_init_integrity(struct gendisk *disk, u16 ms, u8 pi_type,
-                               u32 max_integrity_segments)
+static void nvme_init_integrity(struct gendisk *disk, struct nvme_ns *ns)
  {
+       return 0;
  }
  #endif /* CONFIG_BLK_DEV_INTEGRITY */

@@ -1853,8 +1864,8 @@ static void nvme_update_disk_info(struct gendisk 
*disk,
         if (ns->ms) {
                 if (IS_ENABLED(CONFIG_BLK_DEV_INTEGRITY) &&
                     (ns->features & NVME_NS_METADATA_SUPPORTED))
-                       nvme_init_integrity(disk, ns->ms, ns->pi_type,
- ns->ctrl->max_integrity_segments);
+                       if (nvme_init_integrity(disk, ns))
+                               capacity = 0;
                 else if (!nvme_ns_has_pi(ns))
                         capacity = 0;
         }
@@ -4395,7 +4406,7 @@ EXPORT_SYMBOL_GPL(nvme_stop_ctrl);


and create the resources for the first namespace we find as PI formatted ?





More information about the Linux-nvme mailing list