[PATCH v2] arch_numa: Restore nid checks before registering a memblock with a node

Marc Zyngier maz at kernel.org
Sun Dec 1 11:49:44 PST 2024


Hi Mike,

On Sun, 01 Dec 2024 19:32:22 +0000,
Mike Rapoport <rppt at kernel.org> wrote:
> 
> Hi Marc,
> 
> On Sun, Dec 01, 2024 at 09:27:02AM +0000, Marc Zyngier wrote:
> > Commit 767507654c22 ("arch_numa: switch over to numa_memblks")
> > significantly cleaned up the NUMA registration code, but also
> > dropped an important check that refused to configure a memblock
> > with an invalid nid.
> 
> ... 
>  
> > while previous kernel versions were able to recognise how brain-damaged
> > the machine is, and only build a fake node.
> > 
> > Use the memblock_validate_numa_coverage() helper to restore some sanity
> > and a "working" system.
> > 
> > Fixes: 767507654c22 ("arch_numa: switch over to numa_memblks")
> > Suggested-by: Mike Rapoport <rppt at kernel.org>
> > Signed-off-by: Marc Zyngier <maz at kernel.org>
> > Cc: Catalin Marinas <catalin.marinas at arm.com>
> > Cc: Will Deacon <will at kernel.org>
> > Cc: Zi Yan <ziy at nvidia.com>
> > Cc: Dan Williams <dan.j.williams at intel.com>
> > Cc: David Hildenbrand <david at redhat.com>
> > Cc: Andrew Morton <akpm at linux-foundation.org>
> > Cc: stable at vger.kernel.org
> > ---
> >  drivers/base/arch_numa.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/drivers/base/arch_numa.c b/drivers/base/arch_numa.c
> > index e187016764265..c63a72a1fed64 100644
> > --- a/drivers/base/arch_numa.c
> > +++ b/drivers/base/arch_numa.c
> > @@ -208,6 +208,10 @@ static int __init numa_register_nodes(void)
> >  {
> >  	int nid;
> >  
> > +	/* Check the validity of the memblock/node mapping */
> > +	if (!memblock_validate_numa_coverage(1))
> 
> I've changed this to memblock_validate_numa_coverage(0) and applied along
> with my patch that changed memblock_validate_numa_coverage() to work with
> 0:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock.git/log/?h=thunderx-fix
> 
> Can you please verify that it works on your "quality hardware"?

Commit 427c6179e159b in your tree still has memblock_validate_numa_coverage(1).
Forgot to push out the updated version?

Flipping this to 0 locally, I have verified that this still allows the
old thing to trudge along:

root@duodenum:~# uname -a
Linux duodenum 6.12.0-12115-g427c6179e159-dirty #3896 SMP PREEMPT Sun Dec  1 19:43:13 GMT 2024 aarch64
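
For anyone following the thread, the effect of the threshold argument can
be sketched with a small userspace model. This is not the kernel
implementation: struct region, NO_NODE, uncovered_bytes() and
coverage_ok() are hypothetical names made up for illustration, and the
strict comparison is an assumption about what "work with 0" means (any
memory without a valid nid fails the check when the threshold is 0).

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's NUMA_NO_NODE (assumption for this model). */
#define NO_NODE (-1)

/* Hypothetical, simplified view of one memory region. */
struct region {
	unsigned long size_bytes;
	int nid;
};

/* Sum the bytes of all regions that have no valid node assigned. */
static unsigned long uncovered_bytes(const struct region *r, size_t n)
{
	unsigned long total = 0;

	for (size_t i = 0; i < n; i++)
		if (r[i].nid == NO_NODE)
			total += r[i].size_bytes;

	return total;
}

/*
 * Coverage is accepted only if the uncovered memory does not exceed
 * the threshold. With a threshold of 0, a single uncovered byte is
 * enough to reject the mapping.
 */
static bool coverage_ok(const struct region *r, size_t n,
			unsigned long threshold_bytes)
{
	return uncovered_bytes(r, n) <= threshold_bytes;
}
```

In this model, a caller that gets false back from coverage_ok(regions,
count, 0) would discard the firmware-provided mapping and fall back to a
single fake node, which is the behaviour the patch is trying to restore.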

Thanks again,

	M.

-- 
Without deviation from the norm, progress is not possible.


