[PATCH v3] memblock: make memblock_find_in_range method private

Mike Rapoport rppt at linux.ibm.com
Wed Aug 11 00:36:29 PDT 2021


On Tue, Aug 10, 2021 at 12:21:46PM -0700, Guenter Roeck wrote:
> On 8/10/21 11:55 AM, Mike Rapoport wrote:
> > On Mon, Aug 09, 2021 at 12:06:41PM -0700, Guenter Roeck wrote:
> > > On Tue, Aug 03, 2021 at 09:42:18AM +0300, Mike Rapoport wrote:
> > > > From: Mike Rapoport <rppt at linux.ibm.com>
> > > > 
> > > > There are a lot of uses of memblock_find_in_range() along with
> > > > memblock_reserve() dating from the time before memblock allocation APIs
> > > > existed.
> > > > 
> > > > memblock_find_in_range() is the very core of memblock allocations, so any
> > > > future changes to its internal behaviour would mandate updates of all the
> > > > users outside memblock.
> > > > 
> > > > Replace the calls to memblock_find_in_range() with equivalent calls to
> > > > memblock_phys_alloc() and memblock_phys_alloc_range() and make
> > > > memblock_find_in_range() a private method of memblock.
> > > > 
> > > > This simplifies the callers, ensures that (unlikely) errors in
> > > > memblock_reserve() are handled and improves maintainability of
> > > > memblock_find_in_range().
> > > > 
> > > > Signed-off-by: Mike Rapoport <rppt at linux.ibm.com>
> > > 
> > > I see a number of crashes in next-20210806 when booting x86 images from efi.
> > > 
> > > [    0.000000] efi: EFI v2.70 by EDK II
> > > [    0.000000] efi: SMBIOS=0x1fbcc000 ACPI=0x1fbfa000 ACPI 2.0=0x1fbfa014 MEMATTR=0x1f25f018
> > > [    0.000000] SMBIOS 2.8 present.
> > > [    0.000000] DMI: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
> > > [    0.000000] last_pfn = 0x1ff50 max_arch_pfn = 0x400000000
> > > [    0.000000] x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WP  UC- WT
> > > [    0.000000] Kernel panic - not syncing: alloc_low_pages: can not alloc memory
> > > [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.14.0-rc4-next-20210806 #1
> > > [    0.000000] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
> > > [    0.000000] Call Trace:
> > > [    0.000000]  ? dump_stack_lvl+0x57/0x7d
> > > [    0.000000]  ? panic+0xfc/0x2c6
> > > [    0.000000]  ? alloc_low_pages+0x117/0x156
> > > [    0.000000]  ? phys_pmd_init+0x234/0x342
> > > [    0.000000]  ? phys_pud_init+0x171/0x337
> > > [    0.000000]  ? __kernel_physical_mapping_init+0xec/0x276
> > > [    0.000000]  ? init_memory_mapping+0x1ea/0x2aa
> > > [    0.000000]  ? init_range_memory_mapping+0xdf/0x12e
> > > [    0.000000]  ? init_mem_mapping+0x1e9/0x26f
> > > [    0.000000]  ? setup_arch+0x5ff/0xb6d
> > > [    0.000000]  ? start_kernel+0x71/0x6b4
> > > [    0.000000]  ? secondary_startup_64_no_verify+0xc2/0xcb
> > > 
> > > Bisect points to this patch. Reverting it fixes the problem. Key seems to
> > > be the amount of memory configured in qemu; the problem is not seen if
> > > there is 1G or more of memory, but it is seen with all test boots with
> > > 512M or 256M of memory. It is also seen with almost all 32-bit efi boots.
> > > 
> > > The problem is not seen when booting without efi.
> > 
> > It looks like this change uncovered a problem in
> > x86::memory_map_top_down().
> > 
> > The allocation in alloc_low_pages() is limited by min_pfn_mapped and
> > max_pfn_mapped. The min_pfn_mapped is updated at every iteration of the
> > loop in memory_map_top_down, but there is another loop in
> > init_range_memory_mapping() that maps several regions below the current
> > min_pfn_mapped without updating this variable.
> > 
> > The memory layout in qemu with 256M of RAM and EFI enabled causes
> > exhaustion of the memory limited by min_pfn_mapped and max_pfn_mapped
> > before min_pfn_mapped is updated.
> > 
> > Before this commit there was an unconditional "reservation" of 2M at the
> > end of the memory that moved the initial min_pfn_mapped below the memory
> > reserved by EFI. The addition of a check for xen_domain() removed this
> > reservation for !XEN and made alloc_low_pages() use the range already busy
> > with EFI data.
> > 
> > The patch below moves the update of min_pfn_mapped near the update of
> > max_pfn_mapped so that every time a new range is mapped both limits will be
> > updated accordingly.
> > 
> > diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
> > index 1152a29ce109..be279f6e5a0a 100644
> > --- a/arch/x86/mm/init.c
> > +++ b/arch/x86/mm/init.c
> > @@ -1,3 +1,4 @@
> > +#define DEBUG
> >   #include <linux/gfp.h>
> >   #include <linux/initrd.h>
> >   #include <linux/ioport.h>
> > @@ -485,6 +486,7 @@ static void add_pfn_range_mapped(unsigned long start_pfn, unsigned long end_pfn)
> >   	nr_pfn_mapped = clean_sort_range(pfn_mapped, E820_MAX_ENTRIES);
> >   	max_pfn_mapped = max(max_pfn_mapped, end_pfn);
> > +	min_pfn_mapped = min(min_pfn_mapped, start_pfn);
> >   	if (start_pfn < (1UL<<(32-PAGE_SHIFT)))
> >   		max_low_pfn_mapped = max(max_low_pfn_mapped,
> > @@ -643,7 +645,6 @@ static void __init memory_map_top_down(unsigned long map_start,
> >   		mapped_ram_size += init_range_memory_mapping(start,
> >   							last_start);
> >   		last_start = start;
> > -		min_pfn_mapped = last_start >> PAGE_SHIFT;
> >   		if (mapped_ram_size >= step_size)
> >   			step_size = get_new_step_size(step_size);
> >   	}
> 
> The offending patch was removed from next-20210810, but I applied the above change
> to next-20210809 and it does indeed fix the problem. If it is added as a separate
> patch, please feel free to add
> 
> Tested-by: Guenter Roeck <linux at roeck-us.net>

Thanks!

I wonder now about that comment saying "xen has big range in reserved near
end of ram". Maybe it was the same issue with Xen and we can entirely drop
the reservation of the top 2M?

x86 folks, what do you say?
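
For anyone following along, here is a minimal user-space sketch of the
bookkeeping the patch above moves into add_pfn_range_mapped(). It is a toy
model, not kernel code: struct pfn_limits, pfn_limits_init(), add_range() and
window_pages() are made-up names for illustration, and the PFN values in the
usage note are only examples. It shows that once both limits are updated
together, every newly mapped range immediately widens the window that
alloc_low_pages() is allowed to search.

```c
#include <assert.h>

/* Toy model (hypothetical, not kernel code) of the two allocation limits. */
struct pfn_limits {
	unsigned long min_pfn_mapped;	/* lowest PFN mapped so far */
	unsigned long max_pfn_mapped;	/* highest PFN mapped so far */
};

/* Start with an empty window whose lower bound is the initial mapping end. */
static void pfn_limits_init(struct pfn_limits *l, unsigned long real_end_pfn)
{
	l->min_pfn_mapped = real_end_pfn;
	l->max_pfn_mapped = 0;
}

/*
 * Models the patched behaviour: both limits are updated whenever a range
 * [start_pfn, end_pfn) is recorded, instead of min_pfn_mapped being updated
 * only once per iteration of the outer loop.
 */
static void add_range(struct pfn_limits *l, unsigned long start_pfn,
		      unsigned long end_pfn)
{
	if (end_pfn > l->max_pfn_mapped)
		l->max_pfn_mapped = end_pfn;
	if (start_pfn < l->min_pfn_mapped)
		l->min_pfn_mapped = start_pfn;
}

/* Number of pages between the two limits; 0 if the window is empty. */
static unsigned long window_pages(const struct pfn_limits *l)
{
	if (l->max_pfn_mapped <= l->min_pfn_mapped)
		return 0;
	return l->max_pfn_mapped - l->min_pfn_mapped;
}
```

With the limits initialised at PFN 0x1fe00, add_range(&l, 0x1fe00, 0x1ff50)
followed by add_range(&l, 0x1f000, 0x1f400) leaves min_pfn_mapped at 0x1f000
right after the second call, so the lower range is usable without waiting for
the outer loop to finish. Note that the window only tracks the two limits; it
says nothing about holes in between.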

-- 
Sincerely yours,
Mike.


