[RFC PATCH 8/8] HACK: mm: memory_hotplug: Drop memblock_phys_free() call in try_remove_memory()

David Hildenbrand david at redhat.com
Fri May 31 00:49:32 PDT 2024


On 29.05.24 19:12, Jonathan Cameron wrote:
> I'm not sure what this is balancing, but if it is necessary then the reserved
> memblock approach can't be used to stash NUMA node assignments as after the
> first add / remove cycle the entry is dropped so not available if memory is
> re-added at the same HPA.
> 
> This patch is here to hopefully spur comments on what this is there for!
> 
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron at huawei.com>
> ---
>   mm/memory_hotplug.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 431b1f6753c0..3d8dd4749dfc 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -2284,7 +2284,7 @@ static int __ref try_remove_memory(u64 start, u64 size)
>   	}
>   
>   	if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) {
> -		memblock_phys_free(start, size);
> +		//		memblock_phys_free(start, size);
>   		memblock_remove(start, size);
>   	}

memblock_phys_free() works on memblock.reserved, memblock_remove() works
on memblock.memory.

If you take a look at the doc at the top of memblock.c:

memblock.memory: physical memory available to the system
memblock.reserved: regions that were allocated [during boot]


memblock.memory is supposed to be a superset of memblock.reserved. Your 
"hack" here indicates that you somehow would be relying on the opposite 
being true, which indicates that you are doing the wrong thing.


memblock_remove() indeed balances against memblock_add_node() for
hotplugged memory [add_memory_resource()]. There seems to be a case where
we would succeed in hotunplugging memory that was part of "memblock.reserved".

But how could that happen? I think the following way:

Once the buddy is up and running, memory allocated during early boot is
not freed back to memblock; instead we usually go via something like
free_reserved_page(), not memblock_free() [because the buddy took over].
So one could end up unplugging memory that still resides in the
memblock.reserved set.

So with memblock_phys_free(), we are enforcing the invariant that 
memblock.memory is a superset of memblock.reserved.
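
To make the invariant concrete, here is a minimal user-space sketch. It is
a toy model and not the kernel's memblock implementation: struct
toy_memblock, the toy_*() helpers, and the addresses are all hypothetical
names and values picked for illustration. It only shows that removing a
range from the "memory" list while a stale entry stays in the "reserved"
list breaks the superset relationship.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy user-space model, NOT the kernel's memblock implementation. */
struct range { uint64_t base, size; bool used; };

#define MAX_RANGES 8

struct toy_memblock {
	struct range memory[MAX_RANGES];   /* models memblock.memory */
	struct range reserved[MAX_RANGES]; /* models memblock.reserved */
};

/* Record [base, base + size) in the first free slot of the list. */
static void toy_add(struct range *list, uint64_t base, uint64_t size)
{
	for (size_t i = 0; i < MAX_RANGES; i++) {
		if (!list[i].used) {
			list[i] = (struct range){ base, size, true };
			return;
		}
	}
	assert(!"toy range list is full");
}

/* Drop every entry fully inside [base, base + size); no splitting. */
static void toy_del(struct range *list, uint64_t base, uint64_t size)
{
	for (size_t i = 0; i < MAX_RANGES; i++) {
		if (list[i].used && list[i].base >= base &&
		    list[i].base + list[i].size <= base + size)
			list[i].used = false;
	}
}

/* True if some entry of the list fully contains range r. */
static bool toy_covered(const struct range *list, const struct range *r)
{
	for (size_t i = 0; i < MAX_RANGES; i++) {
		if (list[i].used && list[i].base <= r->base &&
		    r->base + r->size <= list[i].base + list[i].size)
			return true;
	}
	return false;
}

/* Invariant: every reserved range lies inside some memory range. */
static bool toy_invariant_holds(const struct toy_memblock *mb)
{
	for (size_t i = 0; i < MAX_RANGES; i++) {
		if (mb->reserved[i].used &&
		    !toy_covered(mb->memory, &mb->reserved[i]))
			return false;
	}
	return true;
}

/* Add a block, leave a reserved entry inside it, then unplug the block. */
static bool toy_unplug_keeps_invariant(bool free_reserved)
{
	struct toy_memblock mb = { 0 };
	const uint64_t start = 0x100000000ULL, size = 0x8000000ULL;

	toy_add(mb.memory, start, size);     /* role of memblock_add_node() */
	toy_add(mb.reserved, start, 0x1000); /* entry left in "reserved" */

	if (free_reserved)
		toy_del(mb.reserved, start, size); /* role of memblock_phys_free() */
	toy_del(mb.memory, start, size);           /* role of memblock_remove() */

	return toy_invariant_holds(&mb);
}
```

In this model, toy_unplug_keeps_invariant(true) leaves both lists empty,
while toy_unplug_keeps_invariant(false) leaves a reserved range that no
memory range covers, which is the state the commented-out call would allow.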

Likely, arm64 should store that node assignment somewhere else from which
it can be queried. Or it should be using something like
CONFIG_HAVE_MEMBLOCK_PHYS_MAP for these static windows.
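
As a rough illustration of "store it elsewhere", the following is a
user-space sketch of a fixed window-to-node table that hot-add/remove
never modifies, so the assignment survives any number of add/remove
cycles at the same HPA. Everything here is an assumption made up for the
example (struct toy_window, toy_windows[], toy_hpa_to_nid(), TOY_NO_NODE,
and the addresses); it is not an existing arm64 or memblock interface.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_NO_NODE (-1) /* hypothetical "no node known" marker */

/* One static physical address window and the NUMA node assigned to it. */
struct toy_window {
	uint64_t base;
	uint64_t size;
	int nid;
};

/* Filled once (e.g. from firmware tables); never changed by hotplug. */
static const struct toy_window toy_windows[] = {
	{ 0x100000000ULL, 0x40000000ULL, 0 },
	{ 0x140000000ULL, 0x40000000ULL, 1 },
};

/* Look up the node for a host physical address, or TOY_NO_NODE. */
static int toy_hpa_to_nid(uint64_t hpa)
{
	size_t n = sizeof(toy_windows) / sizeof(toy_windows[0]);

	for (size_t i = 0; i < n; i++) {
		const struct toy_window *w = &toy_windows[i];

		/* Compare via subtraction to avoid overflow of base + size. */
		if (hpa >= w->base && hpa - w->base < w->size)
			return w->nid;
	}
	return TOY_NO_NODE;
}
```

Because the lookup does not depend on memblock.reserved at all, dropping
or keeping the memblock_phys_free() call has no effect on it.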

-- 
Cheers,

David / dhildenb




More information about the linux-arm-kernel mailing list