[PATCH] nvme: uring_cmd specific request_queue for SGLs
Christoph Hellwig
hch at lst.de
Mon Jun 30 23:16:09 PDT 2025
On Mon, Jun 30, 2025 at 08:04:47AM -0600, Keith Busch wrote:
> > My back-of-the-envelope calculations (for 8 byte metadata chunks)
> > suggest otherwise, but I never got around to fully benchmarking it.
> > If you do have a representative workload that you care about I'd love
> > to see the numbers.
>
> Metadata isn't actually the important part of this patch.
>
> The workload just receives data from an io_uring zero-copy network and
> writes it out to disk using uring_cmd. The incoming data can have
> various offsets, so it often sends an iovec with page gaps.
>
> Currently the kernel provides a bounce buffer when there are page gaps.
> That's obviously undesirable when the hardware is capable of handling
> the original vector directly.
Yes, the bounce buffer is obviously not very efficient when transferring
large amounts of data.
> The options to avoid the copies are either:
>
> a. Force the application to split each iovec into a separate command
>
> b. Relax the kernel's limits to match the hardware's capabilities
>
> This patch is trying to do "b".
a, or a variant of that (not using passthrough) would in general be
my preference. Why is that not suitable here?
More information about the Linux-nvme mailing list