[PATCH v10 1/8] mm: introduce FOLL_PCI_P2PDMA to gate getting PCI P2PDMA pages

Logan Gunthorpe logang at deltatee.com
Wed Sep 28 14:38:36 PDT 2022



On 2022-09-26 16:57, Jason Gunthorpe wrote:
> On Fri, Sep 23, 2022 at 05:51:49PM -0600, Logan Gunthorpe wrote:
>> Userspace code that's written for purpose can look at the EREMOTEIO error
>> and tell the user something useful, if we return the correct error.
>> If we return ENOMEM in this case, that is not possible because
>> lots of things might have caused that error.
> 
> That is reasonable, but I'd still prefer to see it done more
> centrally.
> 
> I mean the way the code is structured is at the top of the call chain
> the PIN/GET/0 is decided and then the callchain is run. All the
> callsites of try_grab_page() must be safe to call under FOLL_PIN
> because their caller is making the decision what flag to use.

Ok, so I've done some auditing here.

I've convinced myself it's safe to access the page before incrementing
the reference:

 * In the try_grab_page() case it must be safe as all call sites do seem
to be called under the appropriate ptl or mmap_lock (though this is hard
to audit). It's also true that it touches the page struct in the sense
of the reference.
 * In the try_grab_folio() case there already is already a similar
FOLL_LONGTERM check in that function *before* getting the reference and
the page should be stable due to the existing gup fast guarantees.

So we don't need to do the check after we have the reference and release
it when it fails. This simplifies things.

Moving the check into try_grab_x() should be possible with some cleanup.

For try_grab_page(), there are a few call sites that WARN_ON if it
fails, assuming it cannot fail seeing the page is stable.
try_grab_page() already has a WARN_ON on failure so it appears fine to
remove the second WARN_ON and add a new failure path that doesn't WARN.

For try_grab_folio() there's one call site in follow_hugetlb_page() that
assumes success and warns on failure; but this call site only applies to
hugetlb pages which should never be P2PDMA pages (nor non-longterm pages
which is another existing failure path). So I've added a note in the
comment with a couple other conditions that should not be possible.

I expect this work is way too late for the merge window now so I'll send
v11 after the window. In the meantime, if you want to do a quick review
on the first two patches, it would speed things up if there are obvious
changes. You can see these patches on this git branch:

  https://github.com/sbates130272/linux-p2pmem/  p2pdma_user_cmb_v11pre

Thanks,

Logan






More information about the Linux-nvme mailing list