arm64: kernel BUG at mm/page_alloc.c:1844!
Robert Richter
robert.richter at cavium.com
Wed Oct 5 07:13:13 PDT 2016
Hi all,
I am looking into a memory setup problem on ThunderX systems under
certain memory configurations. The symptom is
kernel BUG at mm/page_alloc.c:1848!
It happens for some configs with 64k page size enabled (at least since
4.5). The bug triggers for page zones with some pages in the zone not
assigned to this particular zone. In my case some pages that are
marked as nomap were not reassigned to the new zone of node 1, so
those are still assigned to node 0. Mark also reported something
similar for non-numa configs. I think this is a related problem where
not all pages of a zone have been reassigned to the new zone during
setup.
Now, the reason for the mis-configuration is a change in pfn_valid()
which reports pages marked nomap as invalid. In memmap_init_zone()
nomap-pages are not reassigned to that node (__init_single_pfn()).
I tried various changes to fix that, but without success so far:
a) I modified reserve_regions() to use memblock_reserve() instead of
memblock_mark_nomap(). This marked efi regions as reserved instead of
unmap. pfn_valid() now worked as before the nomap change. I could boot
the system but noticed the following malloc assertion which looks like
there is some mem corruption:
emacs: malloc.c:2395: sysmalloc: Assertion `(old_top == initial_top (av) && old_size == 0) || ((unsigned long) (old_size) >= MINSIZE && prev_inuse (old_top) && ((unsigned long) old_end & (pagesize - 1)) == 0)' failed.
Other than that the system looked ok so far.
I checked pfn used by the process with kmem:mm_page_alloc_zone_locked,
it looked correct with all pfn allocated from free memory, mem ranges
reported by efi as reserved were not used.
b) I found a quote that for sparsemem the entire memmap (all pages have a
struct *page) for single section (include/linux/mmzone.h):
"In SPARSEMEM, it is assumed that a valid section has a memmap for
the entire section."
So I implemented a arm64 private __early_pfn_valid() function that
uses memblock_is_memory() to setup all pages of a zone. I got the same
result as for a).
c) I modified (almost) all arch arm64 users of pfn_valid() to use
memblock_mark_nomap() instead of pfn_valid() and changed pfn_valid()
to use memblock_is_memory(). Same problem as a).
d) Enabling HOLES_IN_ZONE config option does not looks correct for
sparsemem, trying it anyway causes VM_BUG_ON_PAGE() in in line 1849
since (uninitialized) struct *page is accessed. This did not work
either.
I also noticed the efi ranges to be only 4k page aligned, I checked
reserved mem regions, its transformation to 64k alignment looked ok
too. See below for efi and memblock address ranges.
Is there anything else I could do here? Am I missing something? Could
the malloc assertion be another bug which was uncovered after fixing
the first? Any help is appreciated.
Thanks,
-Robert
[ 0.000000] efi: Getting EFI parameters from FDT:
[ 0.000000] efi: System Table: 0x0000010fffffef18
[ 0.000000] efi: MemMap Address: 0x0000010ff788e018
[ 0.000000] efi: MemMap Size: 0x000006c0
[ 0.000000] efi: MemMap Desc. Size: 0x00000030
[ 0.000000] efi: MemMap Desc. Version: 0x00000001
[ 0.000000] efi: EFI v2.40 by Cavium Thunder cn88xx EFI ThunderX-Firmware-Release-1.22.10-0-g4e85766 Aug 24 2016 15:59:03
[ 0.000000] efi: ACPI=0xfffff000 ACPI 2.0=0xfffff014 SMBIOS 3.0=0x10ffafcf000
[ 0.000000] efi: Processing EFI memory map:
[ 0.000000] efi: 0x000001400000-0x00000147ffff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x000001480000-0x0000024affff [Loader Data | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x0000024b0000-0x0000211fffff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x000021200000-0x00002121ffff [Loader Data | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x000021220000-0x0000fffecfff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x0000fffed000-0x0000ffff4fff [ACPI Reclaim Memory| | | | | | | | |WB|WT|WC|UC]*
[ 0.000000] efi: 0x0000ffff5000-0x0000ffff5fff [ACPI Memory NVS | | | | | | | | |WB|WT|WC|UC]*
[ 0.000000] efi: 0x0000ffff6000-0x0000ffffffff [ACPI Reclaim Memory| | | | | | | | |WB|WT|WC|UC]*
[ 0.000000] efi: 0x000100000000-0x000ff7ffffff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x000ff8000000-0x000ff801ffff [Boot Data | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x000ff8020000-0x000fffa9cfff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x000fffa9d000-0x000fffffffff [Boot Data | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x010000400000-0x010f8465ffff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x010f84660000-0x010f8568ffff [Loader Code | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x010f85690000-0x010ff788afff [Loader Data | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x010ff788b000-0x010ff788dfff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x010ff788e000-0x010ff7890fff [Loader Data | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x010ff7891000-0x010ff78adfff [Loader Code | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x010ff78ae000-0x010ff9e97fff [Boot Data | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x010ff9e98000-0x010ff9f20fff [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC]*
[ 0.000000] efi: 0x010ff9f21000-0x010ffaeb5fff [Boot Data | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x010ffaeb6000-0x010ffafc8fff [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC]*
[ 0.000000] efi: 0x010ffafc9000-0x010ffafccfff [Runtime Code |RUN| | | | | | | |WB|WT|WC|UC]*
[ 0.000000] efi: 0x010ffafcd000-0x010ffaff4fff [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC]*
[ 0.000000] efi: 0x010ffaff5000-0x010ffb008fff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x010ffb009000-0x010ffeb6cfff [Boot Data | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x010ffeb6d000-0x010ffec94fff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x010ffec95000-0x010fffe28fff [Boot Data | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x010fffe29000-0x010fffe3ffff [Conventional Memory| | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x010fffe40000-0x010fffe53fff [Loader Data | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x010fffe54000-0x010ffffb8fff [Boot Code | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x010ffffb9000-0x010ffffccfff [Runtime Code |RUN| | | | | | | |WB|WT|WC|UC]*
[ 0.000000] efi: 0x010ffffcd000-0x010fffffefff [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC]*
[ 0.000000] efi: 0x010ffffff000-0x010fffffffff [Boot Data | | | | | | | | |WB|WT|WC|UC]
[ 0.000000] efi: 0x804000001000-0x804000001fff [Memory Mapped I/O |RUN| | | | | | | | | | |UC]
[ 0.000000] efi: 0x87e0d0001000-0x87e0d0001fff [Memory Mapped I/O |RUN| | | | | | | | | | |UC]
[ 0.000000] MEMBLOCK configuration:
[ 0.000000] memory size = 0x1ffe800000 reserved size = 0x36506a21
[ 0.000000] memory.cnt = 0x9
[ 0.000000] memory[0x0] [0x00000001400000-0x000000fffdffff], 0xfebe0000 bytes on node 0 flags: 0x0
[ 0.000000] memory[0x1] [0x000000fffe0000-0x000000ffffffff], 0x20000 bytes on node 0 flags: 0x4
[ 0.000000] memory[0x2] [0x00000100000000-0x00000fffffffff], 0xf00000000 bytes on node 0 flags: 0x0
[ 0.000000] memory[0x3] [0x00010000400000-0x00010ff9e8ffff], 0xff9a90000 bytes on node 1 flags: 0x0
[ 0.000000] memory[0x4] [0x00010ff9e90000-0x00010ff9f2ffff], 0xa0000 bytes on node 1 flags: 0x4
[ 0.000000] memory[0x5] [0x00010ff9f30000-0x00010ffaeaffff], 0xf80000 bytes on node 1 flags: 0x0
[ 0.000000] memory[0x6] [0x00010ffaeb0000-0x00010ffaffffff], 0x150000 bytes on node 1 flags: 0x4
[ 0.000000] memory[0x7] [0x00010ffb000000-0x00010ffffaffff], 0x4fb0000 bytes on node 1 flags: 0x0
[ 0.000000] memory[0x8] [0x00010ffffb0000-0x00010fffffffff], 0x50000 bytes on node 1 flags: 0x4
[ 0.000000] reserved.cnt = 0xd
[ 0.000000] reserved[0x0] [0x00000001480000-0x0000000248ffff], 0x1010000 bytes flags: 0x0
[ 0.000000] reserved[0x1] [0x00000021200000-0x00000021210536], 0x10537 bytes flags: 0x0
[ 0.000000] reserved[0x2] [0x000000c0000000-0x000000dfffffff], 0x20000000 bytes flags: 0x0
[ 0.000000] reserved[0x3] [0x00000ffbfb8000-0x00000ffffdffff], 0x4028000 bytes flags: 0x0
[ 0.000000] reserved[0x4] [0x00000ffffecb00-0x00000fffffffff], 0x13500 bytes flags: 0x0
[ 0.000000] reserved[0x5] [0x00010f856a0000-0x00010f92a6ffff], 0xd3d0000 bytes flags: 0x0
[ 0.000000] reserved[0x6] [0x00010ff7880000-0x00010ff788ffff], 0x10000 bytes flags: 0x0
[ 0.000000] reserved[0x7] [0x00010ffbce0000-0x00010fffceffff], 0x4010000 bytes flags: 0x0
[ 0.000000] reserved[0x8] [0x00010fffee6d80-0x00010ffff2fffb], 0x4927c bytes flags: 0x0
[ 0.000000] reserved[0x9] [0x00010ffff30000-0x00010ffffa000f], 0x70010 bytes flags: 0x0
[ 0.000000] reserved[0xa] [0x00010ffffae280-0x00010ffffaff7f], 0x1d00 bytes flags: 0x0
[ 0.000000] reserved[0xb] [0x00010ffffaffa0-0x00010ffffaffce], 0x2f bytes flags: 0x0
[ 0.000000] reserved[0xc] [0x00010ffffaffd0-0x00010ffffafffe], 0x2f bytes flags: 0x0
More information about the linux-arm-kernel
mailing list