[PATCH v2 1/2] arm64: vmemmap: use virtual projection of linear region
Ard Biesheuvel
ard.biesheuvel at linaro.org
Mon Mar 7 18:15:43 PST 2016
> On 8 Mar 2016, at 08:07, David Daney <ddaney.cavm at gmail.com> wrote:
>
>> On 02/26/2016 08:57 AM, Ard Biesheuvel wrote:
>> Commit dd006da21646 ("arm64: mm: increase VA range of identity map") made
>> some changes to the memory mapping code to allow physical memory to reside
>> at an offset that exceeds the size of the virtual mapping.
>>
>> However, since the size of the vmemmap area is proportional to the size of
>> the VA area, but it is populated relative to the physical space, we may
>> end up with the struct page array being mapped outside of the vmemmap
>> region. For instance, on my Seattle A0 box, I can see the following output
>> in the dmesg log.
>>
>> vmemmap : 0xffffffbdc0000000 - 0xffffffbfc0000000 (   8 GB maximum)
>>           0xffffffbfc0000000 - 0xffffffbfd0000000 ( 256 MB actual)
>>
>> We can fix this by deciding that the vmemmap region is not a projection of
>> the physical space, but of the virtual space above PAGE_OFFSET, i.e., the
>> linear region. This way, we are guaranteed that the vmemmap region is of
>> sufficient size, and we can even reduce the size by half.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel at linaro.org>
>
> I see this commit now in Linus' kernel.org tree in v4.5-rc7.
>
> FYI: I am seeing a crash that goes away when I revert this. My kernel has some other modifications (our NUMA patches) so I haven't yet fully tracked this down on an unmodified kernel, but this is what I am getting:
>
Hi David,
You are the second person to report this issue on a 64k pages kernel, but I haven't managed to reproduce it yet.
Any chance you could instrument vmemmap_populate_basepages to figure out whether the faulting address is populated, and if not, why?
Thanks,
Ard.
> .
> .
> [ 0.000000] Early memory node ranges
> [ 0.000000] node 0: [mem 0x0000000001400000-0x00000000fffeffff]
> [ 0.000000] node 0: [mem 0x00000000ffff0000-0x00000000ffffffff]
> [ 0.000000] node 0: [mem 0x0000000100000000-0x00000003f51cffff]
> [ 0.000000] node 0: [mem 0x00000003f51d0000-0x00000003f51dffff]
> [ 0.000000] node 0: [mem 0x00000003f51e0000-0x00000003fa9bffff]
> [ 0.000000] node 0: [mem 0x00000003fa9c0000-0x00000003faa8ffff]
> [ 0.000000] node 0: [mem 0x00000003faa90000-0x00000003ffa3ffff]
> [ 0.000000] node 0: [mem 0x00000003ffa40000-0x00000003ffa9ffff]
> [ 0.000000] node 0: [mem 0x00000003ffaa0000-0x00000003ffffffff]
> [ 0.000000] Initmem setup node 0 [mem 0x0000000001400000-0x00000003ffffffff]
> [ 0.000000] Unable to handle kernel paging request at virtual address fffffdff60ff0000
> [ 0.000000] pgd = fffffe0000e00000
> [ 0.000000] [fffffdff60ff0000] *pgd=00000003ffd90003, *pud=00000003ffd90003, *pmd=00000003ffd90003, *pte=0000000000000000
> [ 0.000000] Internal error: Oops: 96000007 [#1] PREEMPT SMP
> [ 0.000000] Modules linked in:
> [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.5.0-rc7-numa+ #123
> [ 0.000000] Hardware name: Cavium ThunderX CN88XX board (DT)
> [ 0.000000] task: fffffe0000b39880 ti: fffffe0000b00000 task.ti: fffffe0000b00000
> [ 0.000000] PC is at memmap_init_zone+0xe0/0x130
> [ 0.000000] LR is at memmap_init_zone+0xc0/0x130
> [ 0.000000] pc : [<fffffe0000a85b28>] lr : [<fffffe0000a85b08>] pstate: 800002c5
> [ 0.000000] sp : fffffe0000b03cf0
> [ 0.000000] x29: fffffe0000b03cf0 x28: fffffe03febe1b80
> [ 0.000000] x27: fffffe03febe2a08 x26: fffffe0000b30000
> [ 0.000000] x25: 0000000000040000 x24: 0000000000000000
> [ 0.000000] x23: 1000000000000000 x22: 0000000000000000
> [ 0.000000] x21: 0000000000000001 x20: 00000000ffffffff
> [ 0.000000] x19: 000000000003fd40 x18: fffffe0000d7c240
> [ 0.000000] x17: 0000000000000009 x16: 0000000400000000
> [ 0.000000] x15: 0000000000000008 x14: 0000000000000004
> [ 0.000000] x13: 0000000000000000 x12: 000000000001c854
> [ 0.000000] x11: 00000003fffe37a8 x10: 0000000000000004
> [ 0.000000] x9 : 0000000000000000 x8 : fffffe03febc0000
> [ 0.000000] x7 : 0000000000000000 x6 : fffffe0000d7c240
> [ 0.000000] x5 : fffffdff60000000 x4 : 0000000000000007
> [ 0.000000] x3 : fffffdff60000000 x2 : fffffe0000d7c300
> [ 0.000000] x1 : 0000000000ff0000 x0 : 0000000000000001
> [ 0.000000]
> [ 0.000000] Process swapper (pid: 0, stack limit = 0xfffffe0000b00020)
> [ 0.000000] Stack: (0xfffffe0000b03cf0 to 0xfffffe0000b04000)
> [ 0.000000] 3ce0: fffffe0000b03d40 fffffe0000a85fd4
> [ 0.000000] 3d00: fffffe03febe2400 fffffe0000aa3000 0000000000000000 0000000000030000
> [ 0.000000] 3d20: fffffe0000b5eab4 fffffe0000b5eab8 0000000000000001 fffffe0000734d18
> [ 0.000000] 3d40: fffffe0000b03df0 fffffe0000a56928 0000000000000000 fffffe0000b32e68
> [ 0.000000] 3d60: 0000000000000004 fffffe0000b32e70 fffffe0000b30000 fffffe03febe1b80
> [ 0.000000] 3d80: fffffe0000c40000 0000000002200000 fffffe0000081198 00000003f51eaa0c
> [ 0.000000] 3da0: fffffe0000d7bc90 0000000000030000 fffffe000093f148 fffffe000093f008
> [ 0.000000] 3dc0: fffffe000093efb0 fffffe000093efd8 0000000000010000 fffffe0000af6798
> [ 0.000000] 3de0: 0000000000000140 0000000000040000 fffffe0000b03e90 fffffe0000a44e84
> [ 0.000000] 3e00: fffffe0000b03ec8 0000000001400000 0000000000040000 fffffe0000b30000
> [ 0.000000] 3e20: fffffe0000b30000 0000000001400000 fffffe0000c40000 0000000002200000
> [ 0.000000] 3e40: fffffe0000081198 00000003f51eaa0c 0000000000040000 0000000000000000
> [ 0.000000] 3e60: fffffe0000b30000 0000000001400000 ffffffff00c40000 00000000ffffffff
> [ 0.000000] 3e80: 000000000003ffaa 0000000000040000 fffffe0000b03ee0 fffffe0000a452d4
> [ 0.000000] 3ea0: fffffe03febd0000 fffffe0000b30b98 fffffe0000080000 fffffe0000a45168
> [ 0.000000] 3ec0: fffffe0000b03ee0 0000000000010000 0000000000040000 0000000000000000
> [ 0.000000] 3ee0: fffffe0000b03f00 fffffe0000a42f2c fffffdfffa800000 00000003ffaa0000
> [ 0.000000] 3f00: fffffe0000b03fa0 fffffe0000a40680 0000000000000000 fffffe0000b30b98
> [ 0.000000] 3f20: 0000000021200000 00000003f50de7c8 fffffe0000b30000 0000000001400000
> [ 0.000000] 3f40: 00000000021d0000 0000000002200000 fffffe0000081198 00000000ffffffc8
> [ 0.000000] 3f60: 00000003f50deb40 fffffe00007200a8 0000000000000001 0000000021200000
> [ 0.000000] 3f80: ffffffffffffffff 0000000000000000 0000000080808080 fefefefefefefefe
> [ 0.000000] 3fa0: 0000000000000000 fffffe00000811b4 00000003f50deb40 0000000000000e12
> [ 0.000000] 3fc0: 0000000021200000 00000003f50de7c8 00000003f50de7dd 0000000001400000
> [ 0.000000] 3fe0: 0000000000000000 fffffe0000a88a28 0000000000000000 0000000000000000
> [ 0.000000] Call trace:
> [ 0.000000] Exception stack(0xfffffe0000b03b30 to 0xfffffe0000b03c50)
> [ 0.000000] 3b20: 000000000003fd40 00000000ffffffff
> [ 0.000000] 3b40: fffffe0000b03cf0 fffffe0000a85b28 00000003fffba000 0000000000006000
> [ 0.000000] 3b60: 0000000000000004 0000000000000000 fffffe0000b03be0 fffffe00001d253c
> [ 0.000000] 3b80: 00000003fffba000 0000000000006000 fffffe0000a5abec 0000000000000080
> [ 0.000000] 3ba0: 0000000000000000 0000000000000000 0000000000010000 fffffe0000b674f0
> [ 0.000000] 3bc0: fffffe0000942288 00000003fffba000 0000000000000001 0000000000ff0000
> [ 0.000000] 3be0: fffffe0000d7c300 fffffdff60000000 0000000000000007 fffffdff60000000
> [ 0.000000] 3c00: fffffe0000d7c240 0000000000000000 fffffe03febc0000 0000000000000000
> [ 0.000000] 3c20: 0000000000000004 00000003fffe37a8 000000000001c854 0000000000000000
> [ 0.000000] 3c40: 0000000000000004 0000000000000008
> [ 0.000000] [<fffffe0000a85b28>] memmap_init_zone+0xe0/0x130
> [ 0.000000] [<fffffe0000a85fd4>] free_area_init_node+0x45c/0x4a4
> [ 0.000000] [<fffffe0000a56928>] free_area_init_nodes+0x594/0x5ec
> [ 0.000000] [<fffffe0000a44e84>] bootmem_init+0xc8/0xf8
> [ 0.000000] [<fffffe0000a452d4>] paging_init_rest+0x1c/0xdc
> [ 0.000000] [<fffffe0000a42f2c>] setup_arch+0x118/0x5a0
> [ 0.000000] [<fffffe0000a40680>] start_kernel+0xa8/0x3e0
> [ 0.000000] [<fffffe00000811b4>] 0xfffffe00000811b4
> [ 0.000000] Code: cb414261 f2dfbfe5 d37ae421 f2ffffe5 (f8656820)
> [ 0.000000] ---[ end trace cb88537fdc8fa200 ]---
> [ 0.000000] Kernel panic - not syncing: Fatal exception
> [ 0.000000] ---[ end Kernel panic - not syncing: Fatal exception
>
> .
> .
> .
>
>
>> ---
>> v2: simplify the expression for vmemmap, forward compatible with the patch that
>> changes the type of memstart_addr to s64
>>
>> arch/arm64/include/asm/pgtable.h | 7 ++++---
>> arch/arm64/mm/init.c | 4 ++--
>> 2 files changed, 6 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
>> index 16438dd8916a..43abcbc30813 100644
>> --- a/arch/arm64/include/asm/pgtable.h
>> +++ b/arch/arm64/include/asm/pgtable.h
>> @@ -34,18 +34,19 @@
>> /*
>> * VMALLOC and SPARSEMEM_VMEMMAP ranges.
>> *
>> - * VMEMAP_SIZE: allows the whole VA space to be covered by a struct page array
>> + * VMEMAP_SIZE: allows the whole linear region to be covered by a struct page array
>> * (rounded up to PUD_SIZE).
>> * VMALLOC_START: beginning of the kernel vmalloc space
>> * VMALLOC_END: extends to the available space below vmmemmap, PCI I/O space,
>> * fixed mappings and modules
>> */
>> -#define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT)) * sizeof(struct page), PUD_SIZE)
>> +#define VMEMMAP_SIZE ALIGN((1UL << (VA_BITS - PAGE_SHIFT - 1)) * sizeof(struct page), PUD_SIZE)
>>
>> #define VMALLOC_START (MODULES_END)
>> #define VMALLOC_END (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
>>
>> -#define vmemmap ((struct page *)(VMALLOC_END + SZ_64K))
>> +#define VMEMMAP_START (VMALLOC_END + SZ_64K)
>> +#define vmemmap ((struct page *)VMEMMAP_START - (memstart_addr >> PAGE_SHIFT))
>>
>> #define FIRST_USER_ADDRESS 0UL
>>
>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index e1f425fe5a81..4ea7efc28e65 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -380,8 +380,8 @@ void __init mem_init(void)
>> MLK_ROUNDUP(_text, _etext),
>> MLK_ROUNDUP(_sdata, _edata),
>> #ifdef CONFIG_SPARSEMEM_VMEMMAP
>> - MLG((unsigned long)vmemmap,
>> - (unsigned long)vmemmap + VMEMMAP_SIZE),
>> + MLG(VMEMMAP_START,
>> + VMEMMAP_START + VMEMMAP_SIZE),
>> MLM((unsigned long)phys_to_page(memblock_start_of_DRAM()),
>> (unsigned long)virt_to_page(high_memory)),
>> #endif
>