[PATCH] nvme-pci: fix host memory buffer allocation size
Christoph Hellwig
hch at lst.de
Tue May 10 00:03:56 PDT 2022
On Thu, Apr 28, 2022 at 06:09:11PM +0200, Thomas Weißschuh wrote:
> > > On my hardware we start with a chunk_size of 4MiB and just allocate
> > > 8 (hmmaxd) * 4 = 32 MiB which is worse than 1 * 200MiB.
> >
> > And that is because the hardware only has a limited set of descriptors.
>
> Wouldn't it make more sense then to allocate as much memory as possible for
> each descriptor that is available?
>
> The comment in nvme_alloc_host_mem() tries to "start big".
> But it actually starts with at most 4MiB.
Compared to what other operating systems offer, that is quite large.
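To illustrate, the allocation loop in nvme_alloc_host_mem() currently
looks roughly like this (paraphrased from memory, not a verbatim copy
of the upstream code):

static int nvme_alloc_host_mem(struct nvme_dev *dev, u64 min, u64 preferred)
{
	/* the starting chunk is clamped to PAGE_SIZE * MAX_ORDER_NR_PAGES,
	 * i.e. 4MiB on a typical x86 config */
	u64 min_chunk = min_t(u64, preferred, PAGE_SIZE * MAX_ORDER_NR_PAGES);
	u64 hmminds = max_t(u32, dev->ctrl.hmminds * 4096, PAGE_SIZE * 2);
	u64 chunk_size;

	/* "start big": halve the chunk size until an allocation succeeds.
	 * If hmminds exceeds the 4MiB cap the loop body never runs. */
	for (chunk_size = min_chunk; chunk_size >= hmminds; chunk_size /= 2) {
		if (!__nvme_alloc_host_mem(dev, preferred, chunk_size)) {
			if (!min || dev->host_mem_size >= min)
				return 0;
			nvme_free_host_mem(dev);
		}
	}

	return -ENOMEM;
}

With hmmaxd limiting you to 8 descriptors, a 4MiB chunk caps the
buffer at 8 * 4 = 32 MiB, which is exactly what you are seeing.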
> And on devices that have hmminds > 4MiB the loop condition will never
> succeed at all, and the HMB will not be used.
> My fairly boring hardware is already at an hmminds of 3.3MiB.
>
> > Is there any real problem you are fixing with this? Do you actually
> > see a performance difference on a relevant workload?
>
> I don't have a concrete problem or performance issue.
> During some debugging I stumbled in my kernel logs upon
> "nvme nvme0: allocated 32 MiB host memory buffer"
> and investigated why it was so low.
Until recently we could not even support these large sizes at all on
typical x86 configs.  With my fairly recent change to allow
vmap-remapped IOMMU allocations on x86 we can do that now.  But if we
enabled it unconditionally I'd be a little worried about using too
much memory far too easily.
We could look into removing the min() against
PAGE_SIZE * MAX_ORDER_NR_PAGES to try larger segments for
"segment-challenged" controllers, now that this could work on a lot of
IOMMU-enabled setups.  But I'd rather have a very good reason for that.