[PATCH 2/2] nvme: remove virtual boundary for sgl capable devices
Keith Busch
kbusch at kernel.org
Wed Aug 6 08:04:40 PDT 2025
On Wed, Aug 06, 2025 at 04:55:14PM +0200, Christoph Hellwig wrote:
> On Tue, Aug 05, 2025 at 12:56:08PM -0700, Keith Busch wrote:
> > From: Keith Busch <kbusch at kernel.org>
> >
> > The nvme virtual boundary is only for the PRP format. Devices that can
> > use the SGL format don't need it for IO queues. Drop reporting it for
> > such PCIe devices; fabrics target will continue to use the limit.
>
> That's not quite true any more as of 6.17. We now also rely on it for
> efficiently mapping multiple segments into a single IOMMU mapping.
> So it should still be enforced in IOMMU mode. In many cases we're
> better off splitting I/O rather than forcing a non-optimized IOMMU
> mapping.
Patch 1 removes the reliance on the virt boundary for the IOMMU. This
makes it possible for NVMe to use this optimization on ARM64 SMMU, which
we saw earlier can come with a larger granularity than NVMe's. Without
patch 1, NVMe could never use that optimization on such an architecture,
but now it can for applications that choose to subscribe to that
alignment.
This patch, though, is more about being able to directly use user space
buffers that cannot be split into any valid IOs. That is possible now
that patch 1 no longer relies on the virt boundary for IOMMU
optimizations. In truth, for my use case, the IOMMU is either off or in
passthrough mode, so that optimization isn't reachable. The use case I'm
going for is taking zero-copy receive buffers from a network device and
using them directly for storage IO. The user data doesn't arrive in
nicely aligned segments from there.