[PATCH] arm: flush: don't abuse pfn_valid() to check if pfn is in RAM

Yongqiang Liu liuyongqiang13 at huawei.com
Thu Feb 1 19:19:34 PST 2024


On 2024/2/1 17:00, Mike Rapoport wrote:
> Hi,
>
> Please don't top-post to Linux mailing lists.
>
> On Thu, Feb 01, 2024 at 04:00:04PM +0800, Yongqiang Liu wrote:
>> Thanks very much for the extra explanation. Note that commit 024591f9a6e0
>> ("arm: ioremap: don't abuse pfn_valid() to check if pfn is in RAM") uses
>> memblock_is_map_memory() instead of pfn_valid() to check whether a PFN is
>> in RAM, so I wrote this patch to handle the same case. Otherwise, when we
>> use a pageblock-aligned (4M) memory or UIO address, like:
>>
>>       node   0: [mem 0x00000000c0c00000-0x00000000cc8fffff]
>>       node   0: [mem 0x00000000d0000000-0x00000000da1fffff]
>>
>> or a UIO address set like:
>>
>>     0xc0400000, 0x100000
>>
>> pfn_valid() will return false, just as memblock_is_map_memory() would.
> pfn_valid() should return false if and only if there is no struct page for
> that pfn.
>
> My understanding is that struct pages exist for the range of UIO addresses,
> and hopefully they have PG_reserved bit set, so a better fix IMO would be
> to check if the folio is !reserved.
>   

Thanks! All of the UIO pages have the PG_reserved bit set. I'm also
wondering whether other cases have the same issue, such as DMA or
callers of virt_addr_valid().
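
If I understand your suggestion correctly, the check would sit after
pfn_to_page() instead of replacing pfn_valid(), roughly like this on top
of the hunk quoted below (an untested sketch; I'm assuming
folio_test_reserved() is the right helper for "folio is reserved"):

 	pfn = pte_pfn(pteval);
 	if (!pfn_valid(pfn))
 		return;
 	folio = page_folio(pfn_to_page(pfn));
+	/*
+	 * Reserved pages (UIO regions, no-map carveouts, ...) have no
+	 * lowmem mapping to flush, so bail out before anything touches
+	 * page_address().
+	 */
+	if (folio_test_reserved(folio))
+		return;

That would also keep the pfn_valid()/pfn_to_page() pairing intact.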

>> On 2024/2/1 5:20, Robin Murphy wrote:
>>> On 2024-01-31 7:00 pm, Russell King (Oracle) wrote:
>>>> On Wed, Jan 31, 2024 at 06:39:31PM +0000, Robin Murphy wrote:
>>>>> On 31/01/2024 12:59 pm, Yongqiang Liu wrote:
>>>>>> @@ -292,7 +293,7 @@ void __sync_icache_dcache(pte_t pteval)
>>>>>>             /* only flush non-aliasing VIPT caches for exec mappings */
>>>>>>             return;
>>>>>>         pfn = pte_pfn(pteval);
>>>>>> -    if (!pfn_valid(pfn))
>>>>>> +    if (!memblock_is_map_memory(PFN_PHYS(pfn)))
>>>>>>             return;
>>>>>>         folio = page_folio(pfn_to_page(pfn));
>>>>> Hmm, it's a bit odd in context, since pfn_valid() obviously pairs
>>>>> with this pfn_to_page(), whereas it's not necessarily clear that
>>>>> memblock_is_map_memory() implies pfn_valid().
>>>>>
>>>>> However, in this case we're starting from a PTE - rather than going
>>>>> off to do a slow scan of memblock to determine whether a round-trip
>>>>> through page_address() is going to give back a mapped VA, can we not
>>>>> trivially identify that from whether the PTE itself is valid?
>>>> Depends what you mean by "valid". If you're referring to pte_valid()
>>>> and L_PTE_VALID then no.
>>>>
>>>> On 32-bit non-LPAE, the valid bit is the same as the present bit, and
>>>> needs to be set for the PTE to not fault. Any PTE that is mapping
>>>> something will be "valid" whether it is memory or not, whether it is
>>>> backed by a page or not.
>>>>
>>>> pfn_valid() should be telling us whether the PFN is suitable to be
>>>> passed to pfn_to_page(), and if we have a situation where pfn_valid()
>>>> returns true, but pfn_to_page() returns an invalid page, then that in
>>>> itself is a bug that needs to be fixed and probably has far reaching
>>>> implications for the stability of the kernel.
>>> Right, the problem here seems to be the opposite one, wherein we *do*
>>> often have a valid struct page for an address which is reserved and thus
>>> not mapped by the kernel, but seemingly we then take it down a path
>>> which assumes anything !PageHighmem() is lowmem and dereferences
>>> page_address() without looking.
>>>
>>> However I realise I should have looked closer at the caller, and my idea
>>> is futile since the PTE here is for a userspace mapping, not a kernel
>>> VA, and is already pte_valid_user() && !pte_special(). Plus the fact
>>> that the stack trace indicates an mmap() path suggests it most likely is
>>> a legitimate mapping of some no-map carveout or MMIO region. Oh well. My
>>> first point still stands, though - I think at least a comment to clarify
>>> that assumption would be warranted.
>>>
>>> Thanks,
>>> Robin.
>>> .
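
On Robin's first point: if we stick with memblock_is_map_memory(), a v2
could at least spell out the assumption with a comment, something along
these lines (sketch only, and it assumes the implication Robin mentions,
i.e. that memblock_is_map_memory() does imply pfn_valid() here):

 	pfn = pte_pfn(pteval);
-	if (!pfn_valid(pfn))
+	/*
+	 * memblock_is_map_memory() implies pfn_valid() here and, unlike
+	 * pfn_valid(), also rejects pfns that have a struct page but no
+	 * kernel mapping (no-map carveouts, UIO regions), so the
+	 * pfn_to_page()/page_address() below stay safe.
+	 */
+	if (!memblock_is_map_memory(PFN_PHYS(pfn)))
 		return;
 	folio = page_folio(pfn_to_page(pfn));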


