[PATCH v1 3/5] mm: ptdump: Provide page size to notepage()

Steven Price steven.price at arm.com
Fri Apr 16 17:00:22 BST 2021


On 16/04/2021 16:15, Christophe Leroy wrote:
> 
> 
> On 16/04/2021 17:04, Christophe Leroy wrote:
>>
>>
>> On 16/04/2021 16:40, Christophe Leroy wrote:
>>>
>>>
>>> On 16/04/2021 15:00, Steven Price wrote:
>>>> On 16/04/2021 12:08, Christophe Leroy wrote:
>>>>>
>>>>>
>>>>> On 16/04/2021 12:51, Steven Price wrote:
>>>>>> On 16/04/2021 11:38, Christophe Leroy wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 16/04/2021 11:28, Steven Price wrote:
>>>>>>>> To be honest I don't fully understand why powerpc requires the 
>>>>>>>> page_size - it appears to be using it purely to find "holes" in 
>>>>>>>> the calls to note_page(), but I haven't worked out why such 
>>>>>>>> holes would occur.
>>>>>>>
>>>>>>> It was indeed introduced for KASAN. We have a first commit 
>>>>>>> https://github.com/torvalds/linux/commit/cabe8138 which uses the 
>>>>>>> page size to detect whether a range is KASAN-like.
>>>>>>>
>>>>>>> Then came https://github.com/torvalds/linux/commit/b00ff6d8c as a 
>>>>>>> fix. I can't remember what the problem was exactly, something 
>>>>>>> around the use of hugepages for kernel memory, came as part of 
>>>>>>> the series 
>>>>>>> https://patchwork.ozlabs.org/project/linuxppc-dev/cover/cover.1589866984.git.christophe.leroy@csgroup.eu/ 
>>>>>>
>>>>>> Ah, that's useful context. So it looks like powerpc took a 
>>>>>> different route to reducing the KASAN output to x86.
>>>>>>
>>>>>> Given the generic ptdump code has handling for KASAN already it 
>>>>>> should be possible to drop that from the powerpc arch code, which 
>>>>>> I think means we don't actually need to provide page size to 
>>>>>> notepage(). Hopefully that means more code to delete ;)
>>>>>>
>>>>>
>>>>> Yes ... and no.
>>>>>
>>>>> It looks like the generic ptdump handles the case when several 
>>>>> pgdir entries point to the same kasan_early_shadow_pte. But it 
>>>>> doesn't take into account the powerpc case where we have regular 
>>>>> page tables where several (if not all) PTEs are pointing to the 
>>>>> kasan_early_shadow_page.
>>>>
>>>> I'm not sure I follow quite how powerpc is different here. But could 
>>>> you have a similar check for PTEs against kasan_early_shadow_pte as 
>>>> the other levels already have?
>>>>
>>>> I'm just worried that page_size isn't well defined in this interface 
>>>> and it's going to cause problems in the future.
>>>>
>>>
>>> I'm trying. I reverted the two commits b00ff6d8c and cabe8138.
>>>
>>> At the moment, I don't get exactly what I expect: For linear memory I 
>>> get one line for each 8M page whereas before reverting the patches I 
>>> got one 16M line and one 112M line.
>>>
>>> And for KASAN shadow area I get two lines for the 2x 8M pages 
>>> shadowing linear mem then I get one 4M line for each PGDIR entry 
>>> pointing to kasan_early_shadow_pte.
>>>
>>> 0xf8000000-0xf87fffff 0x07000000         8M   huge        rw       present
>>> 0xf8800000-0xf8ffffff 0x07800000         8M   huge        rw       present
>>> 0xf9000000-0xf93fffff 0x01430000         4M               r        present
>>> ...
>>> 0xfec00000-0xfeffffff 0x01430000         4M               r        present
>>>
>>> Any idea?
>>>
>>
>>
>> I think the difference from other architectures is here:
>>
>>      } else if (flag != st->current_flags || level != st->level ||
>>             addr >= st->marker[1].start_address ||
>>             pa != st->last_pa + PAGE_SIZE) {
>>
>>
>> In addition to the checks everyone does, powerpc also checks "pa != 
>> st->last_pa + PAGE_SIZE".
>> And it is definitely for that test that the page_size argument had 
>> been added.
> 
> By replacing that test with (pa - st->start_pa != addr - 
> st->start_address) it works again. So we definitely don't need the real 
> page size.

Yes that should work. Thanks for figuring it out!
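
As a rough illustration of why the offset comparison needs no page size, 
the test quoted above can be sketched in standalone C. This is not the 
powerpc patch itself: struct range_state and pa_is_contiguous() are 
hypothetical names made up for the example, and only start_pa, 
start_address, addr and pa come from the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the dumper state (assumption). */
struct range_state {
	uint64_t start_address;	/* virtual address where the current range began */
	uint64_t start_pa;	/* physical address where the current range began */
};

/*
 * An entry continues the current range when its physical offset from the
 * start of the range equals its virtual offset. No page size is involved,
 * so 4K, 4M and 8M mappings are all handled by the same comparison.
 */
static bool pa_is_contiguous(const struct range_state *st,
			     uint64_t addr, uint64_t pa)
{
	return pa - st->start_pa == addr - st->start_address;
}
```

With the addresses from the dump above, a range starting at 
0xf8000000 -> 0x07000000 is continued by 0xf8800000 -> 0x07800000 (both 
offsets are 0x800000), but not by 0xf9000000 -> 0x01430000, so the dumper 
would start a new line there.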

> 
>>
>> I see that other architectures except RISCV don't dump the physical 
>> address. But even RISCV doesn't include that check.

Yes not having the physical address certainly simplifies things - 
although I can see why that can be handy to see. The disadvantage is 
that user space or vmalloc()'d memory will produce a lot of output 
because the physical addresses are unlikely to be contiguous. And for 
most uses you don't need the information.

>> That physical address dump was added by commit aaa229529244 
>> ("powerpc/mm: Add physical address to Linux page table dump") 
>> [https://github.com/torvalds/linux/commit/aaa2295]
>>
>> How do other architectures deal with the problem described by the 
>> commit log of that patch?

AFAIK other architectures are "broken" in this regard. In practice I 
don't think it often causes an issue though.

Steve


