[PATCH] arm64: kdump: retain reserved memory regions

AKASHI Takahiro takahiro.akashi at linaro.org
Tue Jan 30 22:23:08 PST 2018


Ard, Bhupesh,

Thank you for the comments.
I will re-post a revised patch soon after running some tests.

But I'm still wondering whether my original approach[1] may be
useful in other (non-ACPI/efi) cases given that the current
memblock_cap_memory_range() has kinda flaw that any memory
reserved by firmware can be ignored at crash dump kernel.

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2018-January/553098.html

Thanks,
-Takahiro AKASHI

On Wed, Jan 31, 2018 at 11:20:20AM +0530, Bhupesh Sharma wrote:
> Hi Ard, Akashi,
> 
> On Mon, Jan 29, 2018 at 5:41 PM, Ard Biesheuvel
> <ard.biesheuvel at linaro.org> wrote:
> > On 29 January 2018 at 08:12, AKASHI Takahiro <takahiro.akashi at linaro.org> wrote:
> >> James,
> >>
> >> On Fri, Jan 19, 2018 at 11:39:58AM +0000, James Morse wrote:
> >>> Hi Akashi,
> >>>
> >>> On 11/01/18 11:38, AKASHI Takahiro wrote:
> >>> > On Wed, Jan 10, 2018 at 11:26:55AM +0000, James Morse wrote:
> >>> >> On 10/01/18 10:09, AKASHI Takahiro wrote:
> >>> >>> This is a fix against the issue that crash dump kernel may hang up
> >>> >>> during booting, which can happen on any ACPI-based system with "ACPI
> >>> >>> Reclaim Memory."
> >>>
> >>> >>> (diagnosis)
> >>> >>> * This fault is a data abort, alignment fault (ESR=0x96000021)
> >>> >>>   during reading out ACPI table.
> >>> >>> * Initial ACPI tables are normally stored in system ram and marked as
> >>> >>>   "ACPI Reclaim memory" by the firmware.
> >>> >>> * After the commit f56ab9a5b73c ("efi/arm: Don't mark ACPI reclaim
> >>> >>>   memory as MEMBLOCK_NOMAP"), those regions' attribute were changed
> >>> >>>   removing NOMAP bit and they are instead "memblock-reserved".
> >>> >>> * When crash dump kernel boots up, it tries to accesses ACPI tables by
> >>> >>>   ioremap'ing them (through acpi_os_ioremap()).
> >>> >>> * Since those regions are not included in device tree's
> >>> >>>   "usable-memory-range" and so not recognized as part of crash dump
> >>> >>>   kernel's system ram, ioremap() will create a non-cacheable mapping here.
> >>> >>
> >>> >> Ugh, because acpi_os_ioremap() looks at the efi memory map through the prism of
> >>> >> what we pulled into memblock, which is different during kdump.
> >>> >>
> >>> >> Is an alternative to teach acpi_os_ioremap() to ask
> >>> >> efi_mem_attributes() directly for the attributes to use?
> >>> >> (e.g. arch_apei_get_mem_attribute())
> >>> >
> >>> > I didn't think of this approach.
> >>> > Do you mean a change like the patch below?
> >>>
> >>> Yes. Aha, you can pretty much re-use the helper directly.
> >>>
> >>> It was just a suggestion, removing the extra abstraction that is causing the bug
> >>> could be cleaner ...
> >>>
> >>> > (I'm still debugging this code since the kernel fails to boot.)
> >>>
> >>> ... but might be too fragile.
> >>>
> >>> There are points during boot when the EFI memory map isn't mapped.
> >>
> >> Right, this was a problem for my patch.
> >> Attached is the revised and workable one.
> >> Efi_memmap_init_late() may alternatively be called in acpi_early_init() or
> >> even in acpi_os_ioremap(), but either way it looks a bit odd.
> >>
> >
> > Akashi-san,
> >
> > efi_memmap_init_late() is currently being called from
> > arm_enable_runtime_services(), which is an early initcall. If that is
> > too late for acpi_early_init(), we could perhaps move the call
> > forward, i.e., sth like
> >
> > ---------8<------------
> > diff --git a/drivers/firmware/efi/arm-runtime.c
> > b/drivers/firmware/efi/arm-runtime.c
> > index 6f60d659b323..e835d3b20af6 100644
> > --- a/drivers/firmware/efi/arm-runtime.c
> > +++ b/drivers/firmware/efi/arm-runtime.c
> > @@ -117,7 +117,7 @@ static bool __init efi_virtmap_init(void)
> >   * non-early mapping of the UEFI system table and virtual mappings for all
> >   * EFI_MEMORY_RUNTIME regions.
> >   */
> > -static int __init arm_enable_runtime_services(void)
> > +void __init efi_enter_virtual_mode(void)
> >  {
> >         u64 mapsize;
> >
> > @@ -156,7 +156,6 @@ static int __init arm_enable_runtime_services(void)
> >
> >         return 0;
> >  }
> > -early_initcall(arm_enable_runtime_services);
> >
> >  void efi_virtmap_load(void)
> >  {
> > diff --git a/init/main.c b/init/main.c
> > index a8100b954839..2d0927768e2d 100644
> > --- a/init/main.c
> > +++ b/init/main.c
> > @@ -674,6 +674,9 @@ asmlinkage __visible void __init start_kernel(void)
> >         debug_objects_mem_init();
> >         setup_per_cpu_pageset();
> >         numa_policy_init();
> > +       if (IS_ENABLED(CONFIG_EFI) &&
> > +           (IS_ENABLED(CONFIG_ARM64) || IS_ENABLED(CONFIG_ARM)))
> > +               efi_enter_virtual_mode();
> >         acpi_early_init();
> >         if (late_time_init)
> >                 late_time_init();
> > ---------8<------------
> >
> > would be reasonable imo. Also, I think it is justifiable to make ACPI
> > depend on UEFI on arm64, which is notably different from x86.
> >
> > (I know 'efi_enter_virtual_mode' is not entirely accurate here, given
> > that we call SetVirtualAddressMap from the UEFI stub on ARM, but it is
> > still close enough, given that one could argue that EFI is not in
> > 'virtual mode' until the mappings are in place)
> >
> >
> >
> >> ===8<===
> >> From c88f4c8106ba7a918c835b1cdf538b1d21019863 Mon Sep 17 00:00:00 2001
> >> From: AKASHI Takahiro <takahiro.akashi at linaro.org>
> >> Date: Mon, 29 Jan 2018 15:07:43 +0900
> >> Subject: [PATCH] arm64: kdump: make acpi_os_ioremap() more generic
> >>
> >> ---
> >>  arch/arm64/include/asm/acpi.h | 23 ++++++++++++++++-------
> >>  arch/arm64/kernel/acpi.c      |  7 ++-----
> >>  init/main.c                   |  4 ++++
> >>  3 files changed, 22 insertions(+), 12 deletions(-)
> >>
> >> diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
> >> index 32f465a80e4e..d53c95f4e1a9 100644
> >> --- a/arch/arm64/include/asm/acpi.h
> >> +++ b/arch/arm64/include/asm/acpi.h
> >> @@ -12,10 +12,12 @@
> >>  #ifndef _ASM_ACPI_H
> >>  #define _ASM_ACPI_H
> >>
> >> +#include <linux/efi.h>
> >>  #include <linux/memblock.h>
> >>  #include <linux/psci.h>
> >>
> >>  #include <asm/cputype.h>
> >> +#include <asm/io.h>
> >>  #include <asm/smp_plat.h>
> >>  #include <asm/tlbflush.h>
> >>
> >> @@ -29,18 +31,22 @@
> >>
> >>  /* Basic configuration for ACPI */
> >>  #ifdef CONFIG_ACPI
> >> +pgprot_t __acpi_get_mem_attribute(phys_addr_t addr);
> >> +
> >>  /* ACPI table mapping after acpi_permanent_mmap is set */
> >>  static inline void __iomem *acpi_os_ioremap(acpi_physical_address phys,
> >>                                             acpi_size size)
> >>  {
> >> +       /* For normal memory we already have a cacheable mapping. */
> >> +       if (memblock_is_map_memory(phys))
> >> +               return (void __iomem *)__phys_to_virt(phys);
> >> +
> >>         /*
> >> -        * EFI's reserve_regions() call adds memory with the WB attribute
> >> -        * to memblock via early_init_dt_add_memory_arch().
> >> +        * We should still honor the memory's attribute here because
> >> +        * crash dump kernel possibly excludes some ACPI (reclaim)
> >> +        * regions from memblock list.
> >>          */
> >> -       if (!memblock_is_memory(phys))
> >> -               return ioremap(phys, size);
> >> -
> >> -       return ioremap_cache(phys, size);
> >> +       return __ioremap(phys, size, __acpi_get_mem_attribute(phys));
> >>  }
> >>  #define acpi_os_ioremap acpi_os_ioremap
> >>
> >> @@ -125,7 +131,10 @@ static inline const char *acpi_get_enable_method(int cpu)
> >>   * for compatibility.
> >>   */
> >>  #define acpi_disable_cmcff 1
> >> -pgprot_t arch_apei_get_mem_attribute(phys_addr_t addr);
> >> +static inline pgprot_t arch_apei_get_mem_attribute(phys_addr_t addr)
> >> +{
> >> +       return __acpi_get_mem_attribute(addr);
> >> +}
> >>  #endif /* CONFIG_ACPI_APEI */
> >>
> >>  #ifdef CONFIG_ACPI_NUMA
> >> diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
> >> index b3162715ed78..f94bdf7be439 100644
> >> --- a/arch/arm64/kernel/acpi.c
> >> +++ b/arch/arm64/kernel/acpi.c
> >> @@ -31,10 +31,9 @@
> >>  #include <asm/cpu_ops.h>
> >>  #include <asm/smp_plat.h>
> >>
> >> -#ifdef CONFIG_ACPI_APEI
> >> +/* CONFIG_ACPI_APEI */
> >>  # include <linux/efi.h>
> >>  # include <asm/pgtable.h>
> >> -#endif
> >>
> >>  int acpi_noirq = 1;            /* skip ACPI IRQ initialization */
> >>  int acpi_disabled = 1;
> >> @@ -239,8 +238,7 @@ void __init acpi_boot_table_init(void)
> >>         }
> >>  }
> >>
> >> -#ifdef CONFIG_ACPI_APEI
> >> -pgprot_t arch_apei_get_mem_attribute(phys_addr_t addr)
> >> +pgprot_t __acpi_get_mem_attribute(phys_addr_t addr)
> >>  {
> >>         /*
> >>          * According to "Table 8 Map: EFI memory types to AArch64 memory
> >> @@ -261,4 +259,3 @@ pgprot_t arch_apei_get_mem_attribute(phys_addr_t addr)
> >>                 return __pgprot(PROT_NORMAL_NC);
> >>         return __pgprot(PROT_DEVICE_nGnRnE);
> >>  }
> >> -#endif
> >> diff --git a/init/main.c b/init/main.c
> >> index a8100b954839..a479ece2bae9 100644
> >> --- a/init/main.c
> >> +++ b/init/main.c
> >> @@ -674,6 +674,10 @@ asmlinkage __visible void __init start_kernel(void)
> >>         debug_objects_mem_init();
> >>         setup_per_cpu_pageset();
> >>         numa_policy_init();
> >> +#if defined(CONFIG_ARM64) && defined(CONFIG_EFI)
> >> +       efi_memmap_init_late(efi.memmap.phys_map,
> >> +                       efi.memmap.nr_map * efi.memmap.desc_size);
> >> +#endif
> >>         acpi_early_init();
> >>         if (late_time_init)
> >>                 late_time_init();
> >> --
> >> 2.15.1
> >>
> 
> I tested Ard's patch (on top of Akashi's proposed changes) on the
> huawei taishan machine (where I originally found the problem) and I
> can confirm that I am able to boot the kdump kernel properly and also
> save the crashcore dump on local disk.
> 
> Also as Ard mentioned, 'efi_enter_virtual_mode' is probably not the
> best name for the proposed function as we have already called
> 'SetVirtualAddressMap', but I cannot think of anything better. If
> there are other opinions we can consider the same, otherwise may be we
> can formalize this and queue it up as crashkernel is bricked on arm64
> machines which support acpi boot machines without the same (and
> several kdump users are affected because of the same).
> 
> Please feel free to add:
> 
> Tested-by: Bhupesh Sharma <bhsharma at redhat.com>
> 
> Regards,
> Bhupesh



More information about the kexec mailing list