[RFC RESEND 16/16] nvme-pci: use blk_rq_dma_map() for NVMe SGL

Wed Mar 6 08:14:38 PST 2024

On Wed, Mar 06, 2024 at 11:05:18AM -0400, Jason Gunthorpe wrote:
> > Yes.  And this whole proposal also seems clearly confused (not just
> > because of the gazillion reposts) but because it mixes up the case
> > where we can coalesce CPU regions into a single dma_addr_t range
> > (iommu and maybe in the future swiotlb) and one where we need a
> 
> I had the broad expectation that the DMA API user would already be
> providing a place to store the dma_addr_t as it has to feed that into
> the HW. That memory should simply last up until we do dma unmap and
> the cases that need dma_addr_t during unmap can go get it from there.

Well.  The dma_addr_t needs to be fed into the hardware somehow
obviously.  But for a the coalesced case we only need one such
field, not Nranges.

> We can't do much on the map side as single range doesn't imply
> contiguous range, P2P and alignment create discontinuities in the
> dma_addr_t that still have to be delt with.

For alignment the right answer is almost always to require the
upper layers to align to the iommu granularity.  We've been a bit
lax about that due to the way scatterlists are designed, but
requiring the proper alignment actually benefits everyone.