[PATCH v2 06/19] mm/pagewalk: Check pfnmap for folio_walk_start()
Peter Xu
peterx at redhat.com
Thu Aug 29 11:45:16 PDT 2024
On Thu, Aug 29, 2024 at 08:35:49AM +0200, David Hildenbrand wrote:
> Fortunately, we did an excellent job at documenting vm_normal_page():
>
> * There are 2 broad cases. Firstly, an architecture may define a pte_special()
> * pte bit, in which case this function is trivial. Secondly, an architecture
> * may not have a spare pte bit, which requires a more complicated scheme,
> * described below.
> *
> * A raw VM_PFNMAP mapping (ie. one that is not COWed) is always considered a
> * special mapping (even if there are underlying and valid "struct pages").
> * COWed pages of a VM_PFNMAP are always normal.
> *
> * The way we recognize COWed pages within VM_PFNMAP mappings is through the
> * rules set up by "remap_pfn_range()": the vma will have the VM_PFNMAP bit
> * set, and the vm_pgoff will point to the first PFN mapped: thus every special
> * mapping will always honor the rule
> *
> * pfn_of_page == vma->vm_pgoff + ((addr - vma->vm_start) >> PAGE_SHIFT)
> *
> * And for normal mappings this is false.
> *
>
> remap_pfn_range_notrack() will currently handle that for us:
>
> if (is_cow_mapping(vma->vm_flags)) {
> if (addr != vma->vm_start || end != vma->vm_end)
> return -EINVAL;
> }
>
> Even if [1] would succeed, the is_cow_mapping() check will return NULL and it will
> all work as expected, even without pte_special().
IMHO referencing vm_pgoff is ambiguous, and could be wrong, if without a
clear contract.
For example, consider when the driver setup a MAP_PRIVATE + VM_PFNMAP vma,
vm_pgoff to be not the "base PFN" but some random value, then for a COWed
page it's possible the calculation accidentally satisfies "pfn ==
vma->vm_pgoff + off". Then it could wrongly return NULL rather than the
COWed anonymous page here. This is extremely unlikely, but just to show
why it's wrong to reference it at all.
>
> Because VM_PFNMAP is easy: in a !COW mapping, everything is special.
Yes it's safe for vfio-pci, as vfio-pci doesn't have private mappings. But
still, I don't think it's clear enough now on how VM_PFNMAP should be
mapped.
--
Peter Xu
More information about the linux-arm-kernel
mailing list