[PATCH] nvme_core: scan namespaces asynchronously
stuart hayes
stuart.w.hayes at gmail.com
Fri Jan 12 11:36:51 PST 2024
>
>
> On 04/01/2024 18:47, Keith Busch wrote:
>> On Thu, Jan 04, 2024 at 10:38:26AM -0600, Stuart Hayes wrote:
>>> Currently NVME namespaces are scanned serially, so it can take a long time
>>> for all of a controller's namespaces to become available, especially with a
>>> slower (fabrics) interface with large number (~1000) of namespaces.
>>>
>>> Use async function calls to make namespace scanning happen in parallel,
>>> and add a (boolean) module parameter "async_ns_scan" to enable this.
>>
>> Hm, we're not doing a whole lot of blocking IO to bring up a namespace,
>> so I'm a little surprised it makes a noticeable difference. How much time
>> improvement are you observing by parallelizing the scan? Is there a
>> tipping point in Number of Namespaces where inline scanning is better
>> than asynchronous? And if it is a meaningful gain, let's not introduce
>> another module parameter to disable it.
>
> I don't think it is a good idea, since some of the namespace characteristics must be validated at reconnection time, for example.
> I actually prepared a patch that makes sure we sync the ns scanning before kicking the ns blk queue, to avoid that situation.
> For example, if for some reason ns1 changes its UUID, then we must remove it and open a new bdev instead. We can't kick old requests to it...
>
Sorry for the delayed response. I had hoped to get exact data on how long the scan takes with
and without the patch before responding, but it is taking a while (I'm having to rely on someone
else to do the testing). I'll respond with the data as soon as I get it; hopefully it won't be
too much longer. The time it takes to scan namespaces adds up when there are 1000 namespaces and
you have a fabrics controller on a network that isn't too fast.

I don't expect there would be any reason to disable this. I only added the module parameter so
it could be disabled in case some unforeseen issue came up, but I can remove it.

To Max Gurtovoy: this patch wouldn't change when or how namespaces are validated. It just
puts the actual scan work function on a workqueue so the scans can happen in parallel. It will
do the same work to scan, at the same point, and it will wait for all the scanning to finish
before proceeding. I don't understand how this patch would make the situation you mention any
worse.
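
To make the "start every scan, then wait for all of them" ordering concrete, here is a rough user-space sketch of the pattern. It is not the patch and not kernel code: POSIX threads stand in for the kernel's deferred-work mechanism, and struct ns_scan, scan_one_ns() and scan_all_namespaces() are hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-namespace work item; not a structure from the nvme driver. */
struct ns_scan {
	unsigned int nsid;	/* namespace ID to scan */
	pthread_t thread;	/* worker handling this namespace */
	int started;		/* nonzero if a worker thread was created */
	int done;		/* set once the scan for this namespace finished */
};

/* Stand-in for the per-namespace scan; real work would block on I/O here. */
static void *scan_one_ns(void *arg)
{
	struct ns_scan *scan = arg;

	scan->done = 1;
	return NULL;
}

/*
 * Start one scan per namespace, then wait for every scan to finish before
 * returning, so the caller never proceeds with a scan still in flight.
 * Returns the number of namespaces whose scan completed.
 */
static size_t scan_all_namespaces(struct ns_scan *scans, size_t count)
{
	size_t i, completed = 0;

	assert(scans != NULL || count == 0);

	for (i = 0; i < count; i++) {
		scans[i].done = 0;
		scans[i].started = pthread_create(&scans[i].thread, NULL,
						  scan_one_ns, &scans[i]) == 0;
		if (!scans[i].started)
			scan_one_ns(&scans[i]);	/* fall back to an inline scan */
	}

	/* Barrier: nothing past this loop runs until all scans are done. */
	for (i = 0; i < count; i++) {
		if (scans[i].started)
			pthread_join(scans[i].thread, NULL);
		completed += (size_t)scans[i].done;
	}

	return completed;
}
```

The point of the sketch is only the ordering: the work done per namespace is the same as in a serial loop, and the caller still does not continue until the last scan has returned. Build with -pthread.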