[PATCH 0/2] riscv: mm: Define DIRECT_MAP_PHYSMEM_END, fix ZONE_DEVICE

Vivian Wang wangruikang at iscas.ac.cn
Mon Mar 9 04:09:36 PDT 2026


With HSA_AMD_SVM=y, RISC-V runs into the same problem that arm64 once
had [1]: it tries to use a struct page that lies outside of vmemmap.
See the crash log near the end.

On RISC-V, the range of physical addresses that can actually be mapped
depends on the current MMU mode, i.e. satp_mode. Define
DIRECT_MAP_PHYSMEM_END to expose this limit to get_free_mem_region().

See also commit eeb8fdfcf090 ("arm64: Expose the end of the linear map
in PHYSMEM_END") which fixed the same issue on arm64, although the
situation there is much less complicated.

Patch 1 copies a check in vmemmap_populate() over from arm64, which I
added while debugging this problem. Patch 2 is the actual fix.

[1] https://lore.kernel.org/all/20240903164532.3874988-1-scott@os.amperecomputing.com

Crash log:

[   19.228335] Oops [#1]
[   19.230607] Modules linked in: amdgpu(+) [ ... many more modules omitted ... ]
[   19.309895] CPU: 2 UID: 0 PID: 844 Comm: (udev-worker) Not tainted 6.19.3-ztest #3 PREEMPTLAZY
[   19.318587] Hardware name: Sophgo SG2044 SRD3-10 (DT)
[   19.323632] epc : __init_single_page+0x16/0x78
[   19.328079]  ra : __init_zone_device_page.constprop.0+0x28/0xd0
[   19.333997] epc : ffffffff802ff1be ra : ffffffff80d10de8 sp : ffff8f8002ef3450
[   19.341210]  gp : ffffffff82290cc8 tp : ffffaf808f526c80 t0 : ffffffff800231a8
[   19.348423]  t1 : 0000100000000000 t2 : 0000000000000002 s0 : ffff8f8002ef3460
[   19.355636]  s1 : 00038d7fe6000000 a0 : 00038d7fe6000000 a1 : 00000fffffa00000
[   19.362848]  a2 : 3000000000000000 a3 : 0000000000000000 a4 : 0000200000000000
[   19.370060]  a5 : 0000000000600000 a6 : ffffffff82305608 a7 : 0000000000000001
[   19.377273]  s2 : 00000fffffa00000 s3 : ffffaf8091258028 s4 : 0000000000000000
[   19.384485]  s5 : 0000100000000000 s6 : 0000000000000001 s7 : 00000fffffa00000
[   19.391697]  s8 : ffffffff81708b78 s9 : ffffffff81708b38 s10: 0000000000000001
[   19.398909]  s11: ffffaf8091258028 t3 : ffffffff822b43c0 t4 : 000000000207ffff
[   19.406121]  t5 : ffffffffffffffff t6 : 0000000000000000
[   19.411425] status: 0000000200000120 badaddr: 00038d7fe6000030 cause: 000000000000000f
[   19.419331] [<ffffffff802ff1be>] __init_single_page+0x16/0x78
[   19.425071] [<ffffffff80d10de8>] __init_zone_device_page.constprop.0+0x28/0xd0
[   19.432285] [<ffffffff80d11140>] memmap_init_zone_device+0x108/0x278
[   19.438632] [<ffffffff803f6a7a>] memremap_pages+0x262/0x6b0
[   19.444201] [<ffffffff803f6ef0>] devm_memremap_pages+0x28/0x78
[   19.450029] [<ffffffff053f9370>] kgd2kfd_init_zone_device+0xe0/0x1e0 [amdgpu]
[   19.483691] [<ffffffff059a6df0>] amdgpu_device_ip_init+0xa68/0xad0 [amdgpu]
[   19.517108] [<ffffffff05120f6e>] amdgpu_device_init+0x1a5e/0x21e0 [amdgpu]
[   19.550497] [<ffffffff05122e50>] amdgpu_driver_load_kms+0x20/0xc8 [amdgpu]
[   19.583900] [<ffffffff05116dd6>] amdgpu_pci_probe+0x236/0x6c8 [amdgpu]
[   19.616947] [<ffffffff807b2c10>] local_pci_probe+0x40/0x98
[   19.622439] [<ffffffff807b393c>] pci_device_probe+0xcc/0x280
[   19.628091] [<ffffffff8093527c>] really_probe+0xa4/0x3c0
[   19.633397] [<ffffffff80935614>] __driver_probe_device+0x7c/0x158
[   19.639483] [<ffffffff809357d8>] driver_probe_device+0x38/0xd0
[   19.645308] [<ffffffff80935a3a>] __driver_attach+0xaa/0x200
[   19.650873] [<ffffffff8093297e>] bus_for_each_dev+0x6e/0xc8
[   19.656443] [<ffffffff80934996>] driver_attach+0x26/0x38
[   19.661750] [<ffffffff809340e4>] bus_add_driver+0x11c/0x248
[   19.667317] [<ffffffff80936d1c>] driver_register+0x54/0x100
[   19.672884] [<ffffffff807b1f4c>] __pci_register_driver+0x54/0x68
[   19.678889] [<ffffffff0430e0a8>] amdgpu_init+0xa0/0xff8 [amdgpu]
[   19.711387] [<ffffffff80013a02>] do_one_initcall+0x62/0x2c8
[   19.716960] [<ffffffff80127ad6>] do_init_module+0x5e/0x260
[   19.722445] [<ffffffff80129d1e>] load_module+0x1bd6/0x2388
[   19.727925] [<ffffffff8012a730>] init_module_from_file+0xc8/0x128
[   19.734012] [<ffffffff8012a9d0>] __riscv_sys_finit_module+0x1e0/0x338
[   19.740447] [<ffffffff80d0d62a>] do_trap_ecall_u+0x102/0x3f8
[   19.746101] [<ffffffff80d1dbb4>] handle_exception+0x154/0x160
[   19.751852] Code: ffd2 0013 0000 1141 e022 e406 0800 8a0d 16fa 1672 (3823) 0205
[   19.759334] ---[ end trace 0000000000000000 ]---

---
Vivian Wang (2):
      riscv: mm: WARN_ON() for bad addresses in vmemmap_populate()
      riscv: mm: Define DIRECT_MAP_PHYSMEM_END

 arch/riscv/include/asm/pgtable.h | 10 ++++++++++
 arch/riscv/mm/init.c             |  2 ++
 2 files changed, 12 insertions(+)
---
base-commit: 6de23f81a5e08be8fbf5e8d7e9febc72a5b5f27f
change-id: 20260309-riscv-sparsemem-vmemmap-limits-4734f1f67449

Best regards,
-- 
Vivian "dramforever" Wang



