[RFC PATCH 8/8] HACK: mm: memory_hotplug: Drop memblock_phys_free() call in try_remove_memory()

Mike Rapoport rppt at kernel.org
Mon Jun 3 00:57:28 PDT 2024


On Fri, May 31, 2024 at 09:49:32AM +0200, David Hildenbrand wrote:
> On 29.05.24 19:12, Jonathan Cameron wrote:
> > I'm not sure what this is balancing, but if it is necessary then the reserved
> > memblock approach can't be used to stash NUMA node assignments as after the
> > first add / remove cycle the entry is dropped so not available if memory is
> > re-added at the same HPA.
> > 
> > This patch is here to hopefully spur comments on what this is there for!
> > 
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron at huawei.com>
> > ---
> >   mm/memory_hotplug.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index 431b1f6753c0..3d8dd4749dfc 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -2284,7 +2284,7 @@ static int __ref try_remove_memory(u64 start, u64 size)
> >   	}
> >   	if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) {
> > -		memblock_phys_free(start, size);
> > +		//		memblock_phys_free(start, size);
> >   		memblock_remove(start, size);
> >   	}
> 
> memblock_phys_free() works on memblock.reserved, memblock_remove() works on
> memblock.memory.
> 
> If you take a look at the doc at the top of memblock.c:
> 
> memblock.memory: physical memory available to the system
> memblock.reserved: regions that were allocated [during boot]
> 
> 
> memblock.memory is supposed to be a superset of memblock.reserved. Your

No it's not.
memblock.reserved is more of "if there is memory, don't touch it".
Some regions in memblock.reserved are boot time allocations and they are indeed a
subset of memblock.memory, but some are reservations done by firmware (e.g.
reserved memory in DT) that just might not have corresponding regions in
memblock.memory. It can happen for example, when the same firmware runs on
devices with different memory configuration, but still wants to preserve
some physical addresses.
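
To make that concrete, here is a toy model in plain userspace C. It is not
kernel code: struct toy_region, toy_covered() and the addresses are all made
up for illustration, and it assumes adjacent memory ranges are already merged
so a reserved range only needs to fit inside a single memory region. It only
shows that a layout with a boot-time allocation inside RAM plus a firmware
reservation at an address with no RAM behind it fails a "reserved is a subset
of memory" check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: this is NOT the kernel's struct memblock_region. */
struct toy_region {
	uint64_t base;
	uint64_t size;
};

/* True if [base, base + size) lies entirely inside one region of 'list'. */
bool toy_covered(const struct toy_region *list, size_t n,
		 uint64_t base, uint64_t size)
{
	for (size_t i = 0; i < n; i++) {
		if (base >= list[i].base &&
		    base + size <= list[i].base + list[i].size)
			return true;
	}
	return false;
}

/* True if every reserved region is backed by some memory region. */
bool toy_reserved_is_subset(const struct toy_region *memory, size_t nm,
			    const struct toy_region *reserved, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		if (!toy_covered(memory, nm, reserved[i].base, reserved[i].size))
			return false;
	}
	return true;
}

/* Hypothetical board: 1 GiB of RAM at 0x80000000 (ends at 0xc0000000). */
const struct toy_region toy_memory[] = {
	{ 0x80000000ULL, 0x40000000ULL },
};

const struct toy_region toy_reserved[] = {
	{ 0x80200000ULL, 0x00100000ULL }, /* boot-time allocation, inside RAM */
	{ 0xc8000000ULL, 0x00100000ULL }, /* firmware reservation, no RAM here */
};

/* Runs the subset check on the hypothetical layout above. */
bool toy_example_is_subset(void)
{
	return toy_reserved_is_subset(toy_memory, 1, toy_reserved, 2);
}
```

With this layout toy_example_is_subset() returns false, because the second
reserved entry has no memory behind it on this particular board.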

> "hack" here indicates that you somehow would be relying on the opposite
> being true, which indicates that you are doing the wrong thing.
 
I'm not sure about that, I still have to digest the patches :)
 
> memblock_remove() indeed balances against memblock_add_node() for hotplugged
> memory [add_memory_resource()]. There seems to be a case where we would succeed
> in hotunplugging memory that was part of "memblock.reserved".
> 
> But how could that happen? I think the following way:
> 
> Once the buddy is up and running, memory allocated during early boot is not
> freed back to memblock, but usually we simply go via something like
> free_reserved_page(), not memblock_free() [because the buddy took over]. So
> one could end up unplugging memory that still resides in memblock.reserved
> set.
> 
> So with memblock_phys_free(), we are enforcing the invariant that
> memblock.memory is a superset of memblock.reserved.
> 
> Likely, arm64 should store that node assignment elsewhere from where it can
> be queried. Or it should be using something like
> CONFIG_HAVE_MEMBLOCK_PHYS_MAP for these static windows.
> 
> -- 
> Cheers,
> 
> David / dhildenb
> 
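
For reference, the problem described in the commit message (a node
assignment stashed in a reserved entry is gone after the first add / remove
cycle) can be sketched with a toy table in userspace C. Everything here is
hypothetical: struct toy_stash, toy_reserve(), toy_phys_free() and
toy_lookup_nid() are illustration-only names, not memblock APIs, and the
free path only drops entries fully inside the freed range rather than
splitting on partial overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NO_NODE      (-1)
#define TOY_MAX_RESERVED 8

/* Toy stand-in for a reserved entry that also carries a NUMA node id. */
struct toy_stash {
	uint64_t base;
	uint64_t size;
	int nid;
};

struct toy_stash toy_table[TOY_MAX_RESERVED];
size_t toy_count;

/* Record a window and its node. Returns 0, or -1 if the table is full. */
int toy_reserve(uint64_t base, uint64_t size, int nid)
{
	if (toy_count == TOY_MAX_RESERVED)
		return -1;
	toy_table[toy_count++] = (struct toy_stash){ base, size, nid };
	return 0;
}

/* Drop every entry fully inside [base, base + size). */
void toy_phys_free(uint64_t base, uint64_t size)
{
	size_t kept = 0;

	for (size_t i = 0; i < toy_count; i++) {
		const struct toy_stash *r = &toy_table[i];
		int inside = r->base >= base &&
			     r->base + r->size <= base + size;

		if (!inside)
			toy_table[kept++] = *r;
	}
	toy_count = kept;
}

/* Node recorded for the window containing addr, or TOY_NO_NODE. */
int toy_lookup_nid(uint64_t addr)
{
	for (size_t i = 0; i < toy_count; i++) {
		if (addr >= toy_table[i].base &&
		    addr < toy_table[i].base + toy_table[i].size)
			return toy_table[i].nid;
	}
	return TOY_NO_NODE;
}
```

In this model, once a window has been recorded with toy_reserve() and later
released with toy_phys_free() over the same range, toy_lookup_nid() returns
TOY_NO_NODE for any address in it, so a re-add at the same address has
nothing left to query.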

-- 
Sincerely yours,
Mike.


