[PATCH 06/19] mm/pagewalk: Check pfnmap early for folio_walk_start()
David Hildenbrand
david at redhat.com
Fri Aug 16 02:30:31 PDT 2024
On 14.08.24 15:05, Jason Gunthorpe wrote:
> On Fri, Aug 09, 2024 at 07:25:36PM +0200, David Hildenbrand wrote:
>
>>>> That is in general not what we want, and we still have some places that
>>>> wrongly hard-code that behavior.
>>>>
>>>> In a MAP_PRIVATE mapping you might have anon pages that we can happily walk.
>>>>
>>>> vm_normal_page() / vm_normal_page_pmd() [and as commented as a TODO,
>>>> vm_normal_page_pud()] should be able to identify PFN maps and reject them,
>>>> no?
>>>
>>> Yep, I think we can also rely on special bit.
>
> It is more than just relying on the special bit..
>
> VM_PFNMAP/VM_MIXEDMAP should really only be used inside
> vm_normal_page() because thay are, effectively, support for a limited
> emulation of the special bit on arches that don't have them. There are
> a bunch of weird rules that are used to try and make that work
> properly that have to be followed.
>
> On arches with the sepcial bit they should possibly never be checked
> since the special bit does everything you need.
>
> Arguably any place reading those flags out side of vm_normal_page/etc
> is suspect.
IIUC, your opinion matches mine: VM_PFNMAP/VM_MIXEDMAP and
pte_special()/... usage should be limited to
vm_normal_page/vm_normal_page_pmd/ ... of course, GUP-fast is special
(one of the reason for "pte_special()" and friends after all).
>
>>> Here I chose to follow gup-slow, and I suppose you meant that's also wrong?
>>
>> I assume just nobody really noticed, just like nobody noticed that
>> walk_page_test() skips VM_PFNMAP (but not VM_IO :) ).
>
> Like here..
>
>>> And, just curious: is there any use case you're aware of that can benefit
>>> from caring PRIVATE pfnmaps yet so far, especially in this path?
>>
>> In general MAP_PRIVATE pfnmaps is not really useful on things like MMIO.
>>
>> There was a discussion (in VM_PAT) some time ago whether we could remove
>> MAP_PRIVATE PFNMAPs completely [1]. At least some users still use COW
>> mappings on /dev/mem, although not many (and they might not actually write
>> to these areas).
>
> I've squashed many bugs where kernel drivers don't demand userspace
> use MAP_SHARED when asking for a PFNMAP, and of course userspace has
> gained the wrong flags too. I don't know if anyone needs this, but it
> has crept wrongly into the API.
>
> Maybe an interesting place to start is a warning printk about using an
> obsolete feature and see where things go from there??
Maybe we should start with some way to pr_warn_ONCE() whenever we get a
COW/unshare-fault in such a MAP_PRIVATE mapping, and essentially
populate the fresh anon folio.
Then we don't only know who mmaps() something like that, but who
actually relies on getting anon folios in there.
--
Cheers,
David / dhildenb
More information about the linux-arm-kernel
mailing list