[PATCH v3 1/2] mm: memblock Add some new functions to address the mem limit issue
Mark Rutland
mark.rutland at arm.com
Mon Jun 27 07:28:19 PDT 2016
On Mon, Jun 27, 2016 at 10:27:10AM +0800, Dennis Chen wrote:
> In some cases, memblock is queried to determine whether a physical
> address corresponds to memory present in a system even if unused by
> the OS for the linear mapping, highmem, etc. For example, the ACPI
> core needs this information to determine which attributes to use when
> mapping ACPI regions. Use of incorrect memory types can result in
> faults, data corruption, or other issues.
>
> Removing memory with memblock_enforce_memory_limit throws away this
> information, and so a kernel booted with 'mem=' may suffers from the
> issues described above. To avoid this, we need to keep those NOMAP
> regions instead of removing all above limit, which preserves the
> information we need while preventing other use of the regions.
>
> This patch adds new insfrastructure to retain all NOMAP memblock regions
> while removing others, to cater for this.
>
> At last, we add 'size' and 'flag' debug output in the memblock debugfs
> for ease of the memblock debug.
> The '/sys/kernel/debug/memblock/memory' output looks like before:
> 0: 0x0000008000000000..0x0000008001e7ffff
> 1: 0x0000008001e80000..0x00000083ff184fff
> 2: 0x00000083ff185000..0x00000083ff1c2fff
> 3: 0x00000083ff1c3000..0x00000083ff222fff
> 4: 0x00000083ff223000..0x00000083ffe42fff
> 5: 0x00000083ffe43000..0x00000083ffffffff
>
> After applied:
> 0: 0x0000008000000000..0x0000008001e7ffff 0x0000000001e80000 0x4
> 1: 0x0000008001e80000..0x00000083ff184fff 0x00000003fd305000 0x0
> 2: 0x00000083ff185000..0x00000083ff1c2fff 0x000000000003e000 0x4
> 3: 0x00000083ff1c3000..0x00000083ff222fff 0x0000000000060000 0x0
> 4: 0x00000083ff223000..0x00000083ffe42fff 0x0000000000c20000 0x4
> 5: 0x00000083ffe43000..0x00000083ffffffff 0x00000000001bd000 0x0
The debugfs changes should be a separate patch. Even if they're useful
for debugging this patch, they're logically independent.
> Signed-off-by: Dennis Chen <dennis.chen at arm.com>
> Cc: Catalin Marinas <catalin.marinas at arm.com>
> Cc: Steve Capper <steve.capper at arm.com>
> Cc: Ard Biesheuvel <ard.biesheuvel at linaro.org>
> Cc: Will Deacon <will.deacon at arm.com>
> Cc: Mark Rutland <mark.rutland at arm.com>
> Cc: Rafael J. Wysocki <rafael.j.wysocki at intel.com>
> Cc: Matt Fleming <matt at codeblueprint.co.uk>
> Cc: linux-mm at kvack.org
> Cc: linux-acpi at vger.kernel.org
> Cc: linux-efi at vger.kernel.org
> ---
> include/linux/memblock.h | 1 +
> mm/memblock.c | 55 +++++++++++++++++++++++++++++++++++++++++-------
> 2 files changed, 48 insertions(+), 8 deletions(-)
>
> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index 6c14b61..2925da2 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -332,6 +332,7 @@ phys_addr_t memblock_mem_size(unsigned long limit_pfn);
> phys_addr_t memblock_start_of_DRAM(void);
> phys_addr_t memblock_end_of_DRAM(void);
> void memblock_enforce_memory_limit(phys_addr_t memory_limit);
> +void memblock_mem_limit_remove_map(phys_addr_t limit);
> bool memblock_is_memory(phys_addr_t addr);
> int memblock_is_map_memory(phys_addr_t addr);
> int memblock_is_region_memory(phys_addr_t base, phys_addr_t size);
> diff --git a/mm/memblock.c b/mm/memblock.c
> index ca09915..8099f1a 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -1465,14 +1465,11 @@ phys_addr_t __init_memblock memblock_end_of_DRAM(void)
> return (memblock.memory.regions[idx].base + memblock.memory.regions[idx].size);
> }
>
> -void __init memblock_enforce_memory_limit(phys_addr_t limit)
> +static phys_addr_t __init_memblock __find_max_addr(phys_addr_t limit)
> {
> phys_addr_t max_addr = (phys_addr_t)ULLONG_MAX;
> struct memblock_region *r;
>
> - if (!limit)
> - return;
> -
> /* find out max address */
> for_each_memblock(memory, r) {
> if (limit <= r->size) {
> @@ -1482,6 +1479,20 @@ void __init memblock_enforce_memory_limit(phys_addr_t limit)
> limit -= r->size;
> }
>
> + return max_addr;
> +}
> +
> +void __init memblock_enforce_memory_limit(phys_addr_t limit)
> +{
> + phys_addr_t max_addr;
> +
> + if (!limit)
> + return;
> +
> + max_addr = __find_max_addr(limit);
> + if (max_addr == (phys_addr_t)ULLONG_MAX)
> + return;
We didn't previously return early, so do we actually need this check?
> +
> /* truncate both memory and reserved regions */
> memblock_remove_range(&memblock.memory, max_addr,
> (phys_addr_t)ULLONG_MAX);
> @@ -1489,6 +1500,32 @@ void __init memblock_enforce_memory_limit(phys_addr_t limit)
> (phys_addr_t)ULLONG_MAX);
> }
>
> +void __init memblock_mem_limit_remove_map(phys_addr_t limit)
> +{
> + struct memblock_type *type = &memblock.memory;
> + phys_addr_t max_addr;
> + int i, ret, start_rgn, end_rgn;
> +
> + if (!limit)
> + return;
> +
> + max_addr = __find_max_addr(limit);
> + if (max_addr == (phys_addr_t)ULLONG_MAX)
> + return;
Likewise?
> +
> + ret = memblock_isolate_range(type, max_addr, (phys_addr_t)ULLONG_MAX,
> + &start_rgn, &end_rgn);
> + if (ret) {
> + WARN_ONCE(1, "Mem limit failed, will not be applied!\n");
> + return;
> + }
We don't have a similar warning in memblock_enforce_memory_limit, where
memblock_remove_range() might return an error code from an internal call
to memblock_isolate_range.
The two should be consistent, either both with a message or both
without.
> +
> + for (i = end_rgn - 1; i >= start_rgn; i--) {
> + if (!memblock_is_nomap(&type->regions[i]))
> + memblock_remove_region(type, i);
> + }
> +}
This will preserve nomap regions, but it does mean that we may preserve
less memory that the user asked for, since __find_max_addr counted nomap
(and reserved) regions. Given we've always counted the latter, maybe
that's ok.
We should clarify what __find_max_addr is intended to determine, with a
comment, so as to avoid future ambiguity there.
> +
> static int __init_memblock memblock_search(struct memblock_type *type, phys_addr_t addr)
> {
> unsigned int left = 0, right = type->cnt;
> @@ -1677,13 +1714,15 @@ static int memblock_debug_show(struct seq_file *m, void *private)
> reg = &type->regions[i];
> seq_printf(m, "%4d: ", i);
> if (sizeof(phys_addr_t) == 4)
> - seq_printf(m, "0x%08lx..0x%08lx\n",
> + seq_printf(m, "0x%08lx..0x%08lx 0x%08lx 0x%lx\n",
> (unsigned long)reg->base,
> - (unsigned long)(reg->base + reg->size - 1));
> + (unsigned long)(reg->base + reg->size - 1),
> + (unsigned long)reg->size, reg->flags);
> else
> - seq_printf(m, "0x%016llx..0x%016llx\n",
> + seq_printf(m, "0x%016llx..0x%016llx 0x%016llx 0x%lx\n",
> (unsigned long long)reg->base,
> - (unsigned long long)(reg->base + reg->size - 1));
> + (unsigned long long)(reg->base + reg->size - 1),
> + (unsigned long long)reg->size, reg->flags);
As mentioned above, this should be a separate patch. I have no strong
feelings either way about the logic itself.
Thanks,
Mark.
More information about the linux-arm-kernel
mailing list