[PATCH v9 11/24] mm/hmm: provide generic DMA managing logic

Leon Romanovsky leon at kernel.org
Thu Apr 24 00:15:45 PDT 2025


On Wed, Apr 23, 2025 at 02:28:56PM -0300, Jason Gunthorpe wrote:
> On Wed, Apr 23, 2025 at 11:13:02AM +0300, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro at nvidia.com>
> > 
> > HMM callers use a PFN list to populate the range when calling
> > hmm_range_fault(); the conversion from PFN to DMA address
> > is done by the callers with the help of another DMA list. However,
> > this is wasteful on any modern platform, and with the right
> > logic that DMA list can be avoided.
> > 
> > Provide generic logic to manage these lists and give an interface
> > to map/unmap PFNs to DMA addresses, without requiring the callers
> > to be experts in the DMA core API.
> > 
> > Tested-by: Jens Axboe <axboe at kernel.dk>
> 
> I don't think Jens tested the RDMA and hmm parts :)

I know, but he posted his Tested-by tag on the cover letter and b4 picked
it up for the whole series. I decided to keep it as is.

> 
> > +	/*
> > +	 * The HMM API violates our normal DMA buffer ownership rules and can't
> > +	 * transfer buffer ownership.  The dma_addressing_limited() check is a
> > +	 * best approximation to ensure no swiotlb buffering happens.
> > +	 */
> 
> This is a bit unclear. HMM inherently can't do cache flushing or
> swiotlb bounce buffering because its entire purpose is to DMA directly
> and coherently to a mm_struct's page tables. There are no sensible
> points we could put the required flushing that wouldn't break the
> entire model.
> 
> FWIW I view the fact that we now fail back to userspace in these
> cases instead of quietly malfunctioning to be a big improvement.
> 
> > +bool hmm_dma_unmap_pfn(struct device *dev, struct hmm_dma_map *map, size_t idx)
> > +{
> > +	struct dma_iova_state *state = &map->state;
> > +	dma_addr_t *dma_addrs = map->dma_list;
> > +	unsigned long *pfns = map->pfn_list;
> > +	unsigned long attrs = 0;
> > +
> > +#define HMM_PFN_VALID_DMA (HMM_PFN_VALID | HMM_PFN_DMA_MAPPED)
> > +	if ((pfns[idx] & HMM_PFN_VALID_DMA) != HMM_PFN_VALID_DMA)
> > +		return false;
> > +#undef HMM_PFN_VALID_DMA
> 
> If a v10 comes I'd put this in a const function-level variable:
> 
>           const unsigned long HMM_PFN_VALID_DMA = HMM_PFN_VALID | HMM_PFN_DMA_MAPPED;
> 
> Reviewed-by: Jason Gunthorpe <jgg at nvidia.com>

I have no idea if a v10 is needed. The DMA API has been stable for a long
time, and only the DMA part should go in a shared branch. Everything else
will need to go through the relevant subsystems anyway.
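
If a v10 does happen, the flag check could look roughly like the userspace
sketch below. This is only an illustration of the const-variable form
suggested above, not the final patch: the SKETCH_* macros and
sketch_pfn_is_dma_mapped() are hypothetical stand-ins, and the bit positions
are picked arbitrarily rather than copied from include/linux/hmm.h.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Assumed stand-ins for HMM_PFN_VALID and HMM_PFN_DMA_MAPPED. The bit
 * positions are arbitrary for this sketch; only the fact that they are
 * high bits of an unsigned long matters here.
 */
#define SKETCH_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_PFN_VALID	(1UL << (SKETCH_BITS_PER_LONG - 1))
#define SKETCH_PFN_DMA_MAPPED	(1UL << (SKETCH_BITS_PER_LONG - 7))

static_assert(SKETCH_PFN_VALID != SKETCH_PFN_DMA_MAPPED,
	      "the two flags must be distinct bits");

/* Hypothetical helper: true only when both flags are set on pfns[idx]. */
static bool sketch_pfn_is_dma_mapped(const unsigned long *pfns, size_t idx)
{
	/*
	 * The mask has the same type as the PFN entries, so the high flag
	 * bits are not truncated.
	 */
	const unsigned long valid_dma = SKETCH_PFN_VALID | SKETCH_PFN_DMA_MAPPED;

	return (pfns[idx] & valid_dma) == valid_dma;
}
```

The const local replaces the #define/#undef pair and keeps the name scoped to
the function; an entry with only one of the two flags set is rejected, same
as in the macro version.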

Thanks

> 
> Jason


