[PATCH 06/19] mm/pagewalk: Check pfnmap early for folio_walk_start()

David Hildenbrand david at redhat.com
Fri Aug 16 02:30:31 PDT 2024


On 14.08.24 15:05, Jason Gunthorpe wrote:
> On Fri, Aug 09, 2024 at 07:25:36PM +0200, David Hildenbrand wrote:
> 
>>>> That is in general not what we want, and we still have some places that
>>>> wrongly hard-code that behavior.
>>>>
>>>> In a MAP_PRIVATE mapping you might have anon pages that we can happily walk.
>>>>
>>>> vm_normal_page() / vm_normal_page_pmd() [and as commented as a TODO,
>>>> vm_normal_page_pud()] should be able to identify PFN maps and reject them,
>>>> no?
>>>
>>> Yep, I think we can also rely on special bit.
> 
> It is more than just relying on the special bit..
> 
> VM_PFNMAP/VM_MIXEDMAP should really only be used inside
> vm_normal_page() because thay are, effectively, support for a limited
> emulation of the special bit on arches that don't have them. There are
> a bunch of weird rules that are used to try and make that work
> properly that have to be followed.
> 
> On arches with the sepcial bit they should possibly never be checked
> since the special bit does everything you need.
> 
> Arguably any place reading those flags out side of vm_normal_page/etc
> is suspect.

IIUC, your opinion matches mine: VM_PFNMAP/VM_MIXEDMAP and 
pte_special()/... usage should be limited to 
vm_normal_page/vm_normal_page_pmd/ ... of course, GUP-fast is special 
(one of the reason for "pte_special()" and friends after all).

> 
>>> Here I chose to follow gup-slow, and I suppose you meant that's also wrong?
>>
>> I assume just nobody really noticed, just like nobody noticed that
>> walk_page_test() skips VM_PFNMAP (but not VM_IO :) ).
> 
> Like here..
> 
>>> And, just curious: is there any use case you're aware of that can benefit
>>> from caring PRIVATE pfnmaps yet so far, especially in this path?
>>
>> In general MAP_PRIVATE pfnmaps is not really useful on things like MMIO.
>>
>> There was a discussion (in VM_PAT) some time ago whether we could remove
>> MAP_PRIVATE PFNMAPs completely [1]. At least some users still use COW
>> mappings on /dev/mem, although not many (and they might not actually write
>> to these areas).
> 
> I've squashed many bugs where kernel drivers don't demand userspace
> use MAP_SHARED when asking for a PFNMAP, and of course userspace has
> gained the wrong flags too. I don't know if anyone needs this, but it
> has crept wrongly into the API.
> 
> Maybe an interesting place to start is a warning printk about using an
> obsolete feature and see where things go from there??

Maybe we should start with some way to pr_warn_ONCE() whenever we get a 
COW/unshare-fault in such a MAP_PRIVATE mapping, and essentially 
populate the fresh anon folio.

Then we don't only know who mmaps() something like that, but who 
actually relies on getting anon folios in there.

-- 
Cheers,

David / dhildenb




More information about the linux-arm-kernel mailing list