[PATCH v10 1/8] mm: introduce FOLL_PCI_P2PDMA to gate getting PCI P2PDMA pages

Logan Gunthorpe logang at deltatee.com
Fri Sep 23 12:08:31 PDT 2022



On 2022-09-23 12:13, Jason Gunthorpe wrote:
> On Thu, Sep 22, 2022 at 10:39:19AM -0600, Logan Gunthorpe wrote:
>> GUP callers that expect PCI P2PDMA pages can now set FOLL_PCI_P2PDMA to
>> allow obtaining P2PDMA pages. If GUP is called without the flag and a
>> P2PDMA page is found, it will return an error.
>>
>> FOLL_PCI_P2PDMA cannot be set if FOLL_LONGTERM is set.
> 
> What is causing this? It is really troublesome, I would like to fix
> it. eg I would like to have P2PDMA pages in VFIO iommu page tables and
> in RDMA MR's - both require longterm.

You had said it was required if we were relying on unmap_mapping_range()...

https://lore.kernel.org/all/20210928200506.GX3544071@ziepe.ca/T/#u

> Is it just because ZONE_DEVICE was created for DAX and carried that
> revocable assumption over? Does anything in your series require
> revocable?

We still rely on unmap_mapping_range() indirectly in the unbind path.
So I expect that if something takes a LONGTERM pin, the unbind would
block until whatever process holds the pin releases it. That's less
than ideal and I'm not sure what can be done about it.

>> @@ -2383,6 +2392,10 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
>>  		VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
>>  		page = pte_page(pte);
>>  
>> +		if (unlikely(!(flags & FOLL_PCI_P2PDMA) &&
>> +			     is_pci_p2pdma_page(page)))
>> +			goto pte_unmap;
>> +
>>  		folio = try_grab_folio(page, 1, flags);
>>  		if (!folio)
>>  			goto pte_unmap;
> 
> On closer look this is not in the right place, we cannot touch the
> content of *page without holding a ref, and that doesn't happen
> until try_grab_folio() completes.
> 
> It would be simpler to put this check in try_grab_folio/try_grab_page
> after the ref has been obtained. That will naturally cover all the
> places that need it.

Ok, I can make that change.
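
To make the ordering concrete, here is a rough userspace model of what
I understand the suggestion to be: take the reference first, look at
the page second, and drop the reference again if the caller did not
pass the flag. Every name below (mock_page, mock_get_ref, mock_put_ref,
mock_try_grab, MOCK_FOLL_PCI_P2PDMA) is a hypothetical stand-in made up
for illustration. None of it is the kernel's code, and the real change
to try_grab_folio()/try_grab_page() may well look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in flag; the value does not match the kernel's. */
#define MOCK_FOLL_PCI_P2PDMA 0x1u

/* Hypothetical stand-in for struct page, reduced to what the model needs. */
struct mock_page {
	int refcount;	/* 0 models a page that is already being freed */
	bool is_p2pdma;	/* only meaningful to read while a ref is held */
};

/* Take a reference unless the count has already dropped to zero. */
bool mock_get_ref(struct mock_page *page)
{
	if (page->refcount == 0)
		return false;
	page->refcount++;
	return true;
}

/* Drop a reference previously taken with mock_get_ref(). */
void mock_put_ref(struct mock_page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}

/*
 * Models the suggested ordering: grab the reference first, inspect the
 * page only afterwards, and undo the grab if the caller did not opt in
 * to P2PDMA pages. Returns the page on success, NULL on failure.
 */
struct mock_page *mock_try_grab(struct mock_page *page, unsigned int flags)
{
	if (!mock_get_ref(page))
		return NULL;

	if (!(flags & MOCK_FOLL_PCI_P2PDMA) && page->is_p2pdma) {
		mock_put_ref(page);
		return NULL;
	}

	return page;
}
```

In this model a P2PDMA page grabbed without the flag fails and its
refcount is left unchanged, while the same page grabbed with the flag
succeeds and holds one extra reference. Putting the check in the common
grab helper is also why it would cover every GUP path without repeating
it at each call site.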

Logan




