[PATCH] arm64: Track no early_pgtable_alloc() for kmemleak

Thu Nov 4 10:06:49 PDT 2021

On Thu, Nov 04, 2021 at 11:56:23AM -0400, Qian Cai wrote:
> After switched page size from 64KB to 4KB on several arm64 servers here,
> kmemleak starts to run out of early memory pool due to a huge number of
> those early_pgtable_alloc() calls:
> 
>   kmemleak_alloc_phys()
>   memblock_alloc_range_nid()
>   memblock_phys_alloc_range()
>   early_pgtable_alloc()
>   init_pmd()
>   alloc_init_pud()
>   __create_pgd_mapping()
>   __map_memblock()
>   paging_init()
>   setup_arch()
>   start_kernel()
> 
> Increased the default value of DEBUG_KMEMLEAK_MEM_POOL_SIZE by 4 times
> won't be enough for a server with 200GB+ memory. There isn't much
> interesting to check memory leaks for those early page tables and those
> early memory mappings should not reference to other memory. Hence, no
> kmemleak false positives, and we can safely skip tracking those early
> allocations from kmemleak like we did in the commit fed84c785270
> ("mm/memblock.c: skip kmemleak for kasan_init()") without needing to
> introduce complications to automatically scale the value depends on the
> runtime memory size etc. After the patch, the default value of
> DEBUG_KMEMLEAK_MEM_POOL_SIZE becomes sufficient again.
> 
> Signed-off-by: Qian Cai <quic_qiancai at quicinc.com>
> ---
>  arch/arm64/mm/mmu.c      |  3 ++-
>  include/linux/memblock.h |  1 +
>  mm/memblock.c            | 10 +++++++---
>  3 files changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index d77bf06d6a6d..4d3cfbaa92a7 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -96,7 +96,8 @@ static phys_addr_t __init early_pgtable_alloc(int shift)
>  	phys_addr_t phys;
>  	void *ptr;
>  
> -	phys = memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
> +	phys = memblock_phys_alloc_range(PAGE_SIZE, PAGE_SIZE, 0,
> +					 MEMBLOCK_ALLOC_PGTABLE);
>  	if (!phys)
>  		panic("Failed to allocate page table page\n");
>  
> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index 7df557b16c1e..de903055b01c 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -390,6 +390,7 @@ static inline int memblock_get_region_node(const struct memblock_region *r)
>  #define MEMBLOCK_ALLOC_ANYWHERE	(~(phys_addr_t)0)
>  #define MEMBLOCK_ALLOC_ACCESSIBLE	0
>  #define MEMBLOCK_ALLOC_KASAN		1
> +#define MEMBLOCK_ALLOC_PGTABLE		2
>  
>  /* We are using top down, so it is safe to use 0 here */
>  #define MEMBLOCK_LOW_LIMIT 0
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 659bf0ffb086..13bc56a641c0 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -287,7 +287,8 @@ static phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t size,
>  {
>  	/* pump up @end */
>  	if (end == MEMBLOCK_ALLOC_ACCESSIBLE ||
> -	    end == MEMBLOCK_ALLOC_KASAN)
> +	    end == MEMBLOCK_ALLOC_KASAN ||
> +	    end == MEMBLOCK_ALLOC_PGTABLE)

I think I'll be better to rename MEMBLOCK_ALLOC_KASAN to, say,
MEMBLOCK_ALLOC_NOKMEMLEAK and use that for both KASAN and page table cases.

But more generally, we are going to hit this again and again.
Couldn't we add a memblock allocation as a mean to get more memory to
kmemleak::mem_pool_alloc()?

>  		end = memblock.current_limit;
>  
>  	/* avoid allocating the first page */
> @@ -1387,8 +1388,11 @@ phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size,
>  	return 0;
>  
>  done:
> -	/* Skip kmemleak for kasan_init() due to high volume. */
> -	if (end != MEMBLOCK_ALLOC_KASAN)
> +	/*
> +	 * Skip kmemleak for kasan_init() and early_pgtable_alloc() due to high
> +	 * volume.
> +	 */
> +	if (end != MEMBLOCK_ALLOC_KASAN && end != MEMBLOCK_ALLOC_PGTABLE)
>  		/*
>  		 * The min_count is set to 0 so that memblock allocated
>  		 * blocks are never reported as leaks. This is because many
> -- 
> 2.30.2
> 

-- 
Sincerely yours,
Mike.