[PATCH] mm: cma: report correct node id

Kefeng Wang wangkefeng.wang at huawei.com
Sun Oct 29 21:22:38 PDT 2023



On 2023/10/26 0:37, Nathan Chancellor wrote:
> Hi Kefeng,
> 
> On Thu, Oct 19, 2023 at 09:32:53AM +0800, Kefeng Wang wrote:
>> Use early_pfn_to_nid() to get correct node id from base instead of
>> the default NUMA_NO_NODE in cma_declare_contiguous_nid().
>>
>> Signed-off-by: Kefeng Wang <wangkefeng.wang at huawei.com>
>> ---
>>   mm/cma.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/mm/cma.c b/mm/cma.c
>> index 2b2494fd6b59..97c27e5fe1a2 100644
>> --- a/mm/cma.c
>> +++ b/mm/cma.c
>> @@ -375,6 +375,9 @@ int __init cma_declare_contiguous_nid(phys_addr_t base,
>>   	if (ret)
>>   		goto free_mem;
>>   
>> +	if (nid == NUMA_NO_NODE)
>> +		nid = early_pfn_to_nid(PHYS_PFN(base));
>> +
>>   	pr_info("Reserved %ld MiB at %pa on node %d\n", (unsigned long)size / SZ_1M,
>>   		&base, nid);
>>   	return 0;
>> -- 
>> 2.27.0
>>
> 
> I bisected a RISC-V boot failure in QEMU to this change in -next. It
> happens with OpenSUSE's RISC-V configuration [1], which I was able to
> narrow down to the following configurations on top of defconfig:

Hi Nathan, sorry for the late reply. I will look into it, thanks.

> 
> CONFIG_ACPI_SPCR_TABLE=y
> CONFIG_CMA=y
> CONFIG_DEFERRED_STRUCT_PAGE_INIT=y
> CONFIG_DMA_CMA=y
> CONFIG_NUMA=y
> 
> To easily reproduce:
> 
> $ echo 'CONFIG_ACPI_SPCR_TABLE=y
> CONFIG_CMA=y
> CONFIG_DEFERRED_STRUCT_PAGE_INIT=y
> CONFIG_DMA_CMA=y
> CONFIG_NUMA=y' >kernel/configs/repro.config
> 
> $ make -skj"$(nproc)" ARCH=riscv CROSS_COMPILE=riscv64-linux- defconfig repro.config Image
> 
> $ qemu-system-riscv64 \
>      -display none \
>      -nodefaults \
>      -bios default \
>      -M virt \
>      -append earlycon \
>      -kernel arch/riscv/boot/Image \
>      -initrd riscv-rootfs.cpio \
>      -m 512m \
>      -serial mon:stdio
> 
> OpenSBI v1.3.1
> ...
> 
> <hangs after OpenSBI output>
> 
> Without CONFIG_ACPI_SPCR_TABLE=y, there is a visible crash.
> 
> [    0.000000] Linux version 6.6.0-rc7-next-20231025 (nathan at dev-fedora.c3-large-arm64) (riscv64-linux-gcc (GCC) 13.2.0, GNU ld (GNU Binutils) 2.41) #1 SMP Wed Oct 25 16:14:59 UTC 2023
> ...
> [    0.000000] mem auto-init: stack:all(zero), heap alloc:off, heap free:off
> [    0.000000] page:ff1c000002200000 is uninitialized and poisoned
> [    0.000000] page dumped because: VM_BUG_ON_PAGE(1 && PageCompound(page))
> [    0.000000] ------------[ cut here ]------------
> [    0.000000] kernel BUG at include/linux/page-flags.h:493!
> [    0.000000] Kernel BUG [#1]
> [    0.000000] Modules linked in:
> [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.6.0-rc7-next-20231025 #1
> [    0.000000] Hardware name: riscv-virtio,qemu (DT)
> [    0.000000] epc : __free_pages_core+0x78/0x126
> [    0.000000]  ra : __free_pages_core+0x78/0x126
> [    0.000000] epc : ffffffff8018dd8e ra : ffffffff8018dd8e sp : ffffffff81403d40
> [    0.000000]  gp : ffffffff815013a0 tp : ffffffff8140db00 t0 : 6d75642065676170
> [    0.000000]  t1 : 0000000000000070 t2 : 706d756420656761 s0 : ffffffff81403d50
> [    0.000000]  s1 : 0000000000000004 a0 : 000000000000003c a1 : ffffffff814866a8
> [    0.000000]  a2 : 0000000000000000 a3 : 0000000000000001 a4 : 0000000000000000
> [    0.000000]  a5 : 0000000000000000 a6 : 0000000000000008 a7 : 0000000000000038
> [    0.000000]  s2 : 0000000000088000 s3 : ff1c000002200000 s4 : 0000000000000009
> [    0.000000]  s5 : 00000000ffffffff s6 : 0000000000081800 s7 : 0000000000088200
> [    0.000000]  s8 : 00000000000001c0 s9 : 0040000000000000 s10: ffffffff81500bdd
> [    0.000000]  s11: ffffffff81500bdc t3 : ffffffff81515aa7 t4 : ffffffff81515aa7
> [    0.000000]  t5 : ffffffff81515aa8 t6 : ffffffff81403b58
> [    0.000000] status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
> [    0.000000] [<ffffffff8018dd8e>] __free_pages_core+0x78/0x126
> [    0.000000] [<ffffffff80a12fee>] memblock_free_pages+0x52/0x62
> [    0.000000] [<ffffffff80a15f02>] memblock_free_all+0x1fc/0x27e
> [    0.000000] [<ffffffff80a061fa>] mem_init+0x34/0x22c
> [    0.000000] [<ffffffff80a13114>] mm_core_init+0x116/0x2d0
> [    0.000000] [<ffffffff80a00a6e>] start_kernel+0x3c6/0x742
> [    0.000000] Code: 0405 8399 8b85 d7f1 9597 00e2 8593 2ae5 90ef e5dd (9002) 6597
> [    0.000000] ---[ end trace 0000000000000000 ]---
> [    0.000000] Kernel panic - not syncing: Fatal exception in interrupt
> 
> The rootfs is available at [2] if necessary. If there is any more
> information I can provide or patches I can test, I am more than happy to
> do so.
> 
> [1]: https://github.com/openSUSE/kernel-source/raw/master/config/riscv64/default
> [2]: https://github.com/ClangBuiltLinux/boot-utils/releases
> 
> Cheers,
> Nathan


