[PATCH v1 3/5] mm: ptdump: Provide page size to notepage()
Steven Price
steven.price at arm.com
Fri Apr 16 17:00:22 BST 2021
On 16/04/2021 16:15, Christophe Leroy wrote:
>
>
> Le 16/04/2021 à 17:04, Christophe Leroy a écrit :
>>
>>
>> Le 16/04/2021 à 16:40, Christophe Leroy a écrit :
>>>
>>>
>>> Le 16/04/2021 à 15:00, Steven Price a écrit :
>>>> On 16/04/2021 12:08, Christophe Leroy wrote:
>>>>>
>>>>>
>>>>> Le 16/04/2021 à 12:51, Steven Price a écrit :
>>>>>> On 16/04/2021 11:38, Christophe Leroy wrote:
>>>>>>>
>>>>>>>
>>>>>>> Le 16/04/2021 à 11:28, Steven Price a écrit :
>>>>>>>> To be honest I don't fully understand why powerpc requires the
>>>>>>>> page_size - it appears to be using it purely to find "holes" in
>>>>>>>> the calls to note_page(), but I haven't worked out why such
>>>>>>>> holes would occur.
>>>>>>>
>>>>>>> I was indeed introduced for KASAN. We have a first commit
>>>>>>> https://github.com/torvalds/linux/commit/cabe8138 which uses page
>>>>>>> size to detect whether it is a KASAN like stuff.
>>>>>>>
>>>>>>> Then came https://github.com/torvalds/linux/commit/b00ff6d8c as a
>>>>>>> fix. I can't remember what the problem was exactly, something
>>>>>>> around the use of hugepages for kernel memory, came as part of
>>>>>>> the series
>>>>>>> https://patchwork.ozlabs.org/project/linuxppc-dev/cover/cover.1589866984.git.christophe.leroy@csgroup.eu/
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Ah, that's useful context. So it looks like powerpc took a
>>>>>> different route to reducing the KASAN output to x86.
>>>>>>
>>>>>> Given the generic ptdump code has handling for KASAN already it
>>>>>> should be possible to drop that from the powerpc arch code, which
>>>>>> I think means we don't actually need to provide page size to
>>>>>> notepage(). Hopefully that means more code to delete ;)
>>>>>>
>>>>>
>>>>> Yes ... and no.
>>>>>
>>>>> It looks like the generic ptdump handles the case when several
>>>>> pgdir entries points to the same kasan_early_shadow_pte. But it
>>>>> doesn't take into account the powerpc case where we have regular
>>>>> page tables where several (if not all) PTEs are pointing to the
>>>>> kasan_early_shadow_page .
>>>>
>>>> I'm not sure I follow quite how powerpc is different here. But could
>>>> you have a similar check for PTEs against kasan_early_shadow_pte as
>>>> the other levels already have?
>>>>
>>>> I'm just worried that page_size isn't well defined in this interface
>>>> and it's going to cause problems in the future.
>>>>
>>>
>>> I'm trying. I reverted the two commits b00ff6d8c and cabe8138.
>>>
>>> At the moment, I don't get exactly what I expect: For linear memory I
>>> get one line for each 8M page whereas before reverting the patches I
>>> got one 16M line and one 112M line.
>>>
>>> And for KASAN shadow area I get two lines for the 2x 8M pages
>>> shadowing linear mem then I get one 4M line for each PGDIR entry
>>> pointing to kasan_early_shadow_pte.
>>>
>>> 0xf8000000-0xf87fffff 0x07000000 8M huge rw
>>> present
>>> 0xf8800000-0xf8ffffff 0x07800000 8M huge rw
>>> present
>>> 0xf9000000-0xf93fffff 0x01430000 4M r
>>> present
>> ...
>>> 0xfec00000-0xfeffffff 0x01430000 4M r
>>> present
>>>
>>> Any idea ?
>>>
>>
>>
>> I think the different with other architectures is here:
>>
>> } else if (flag != st->current_flags || level != st->level ||
>> addr >= st->marker[1].start_address ||
>> pa != st->last_pa + PAGE_SIZE) {
>>
>>
>> In addition to the checks everyone do, powerpc also checks "pa !=
>> st->last_pa + PAGE_SIZE".
>> And it is definitely for that test that page_size argument add been
>> added.
>
> By replacing that test by (pa - st->start_pa != addr -
> st->start_address) it works again. So we definitely don't need the real
> page size.
Yes that should work. Thanks for figuring it out!
>
>>
>> I see that other architectures except RISCV don't dump the physical
>> address. But even RISCV doesn't include that check.
Yes not having the physical address certainly simplifies things -
although I can see why that can be handy to see. The disadvantage is
that user space or vmalloc()'d memory will produce a lot of output
because the physical addresses are unlikely to be contiguous. And for
most uses you don't need the information.
>> That physical address dump was added by commit aaa229529244
>> ("powerpc/mm: Add physical address to Linux page table dump")
>> [https://github.com/torvalds/linux/commit/aaa2295]
>>
>> How do other architectures deal with the problem described by the
>> commit log of that patch ?
AFAIK other architectures are "broken" in this regard. In practice I
don't think it often causes an issue though.
Steve
More information about the linux-riscv
mailing list