[BUG] nvme/pci: nvme_map_cmb triggers WARN_ON in __add_pages when CMBSZ.SZ is zero
Kangfenglong
kangfenglong at huawei.com
Wed Jun 17 00:01:32 PDT 2026
Hi all,
We encountered a reproducible WARNING on an ARM64 platform (kernel 6.6) during NVMe CMB mapping:
Misaligned __add_pages start: 0x494a0000 end: 0x494a000f
WARNING: CPU: 0 PID: 10 at mm/memory_hotplug.c:396 __add_pages+0x13c/0x150
Call trace:
__add_pages+0x13c/0x150
arch_add_memory+0xbc/0x158
pagemap_range+0x184/0x410
memremap_pages+0x144/0x300
devm_memremap_pages+0x30/0x88
pci_p2pdma_add_resource+0x120/0x330
nvme_map_cmb+0x178/0x268 [nvme]
nvme_pci_enable+0x128/0x218 [nvme]
nvme_probe+0x124/0x358 [nvme]
Root cause analysis:
The NVMe controller (Shenzhen Unionmemory, device ID 1cc4:0123) reports CMBSZ register = 0x00000100, which decodes as:
- SZU (bits 11:8) = 1 → size unit = 64KB
- SZ (bits 31:12) = 0 → size value = 0
- SQS/CQS/LISTS/RDS/WDS (bits 4:0) = 0 → no CMB capabilities advertised
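The decode above can be reproduced with a small userspace sketch. The struct and function names here are hypothetical and are not the driver's helpers; the bit positions are the ones listed above, and SZU values above 6 are treated as reserved in this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; bit positions as listed above. */
struct cmbsz_fields {
	uint32_t sz;   /* bits 31:12, size in multiples of the unit */
	uint32_t szu;  /* bits 11:8, size unit selector */
	uint32_t caps; /* bits 4:0, capability flags */
};

struct cmbsz_fields cmbsz_decode(uint32_t cmbsz)
{
	struct cmbsz_fields f = {
		.sz = cmbsz >> 12,
		.szu = (cmbsz >> 8) & 0xf,
		.caps = cmbsz & 0x1f,
	};
	return f;
}

/*
 * Unit in bytes: 4 KiB for SZU=0, multiplied by 16 for each step.
 * Returns 0 for SZU > 6, which this sketch treats as reserved.
 */
uint64_t cmbsz_unit_bytes(uint32_t szu)
{
	assert(szu <= 0xf); /* SZU is a 4-bit field */
	return szu <= 6 ? 1ULL << (12 + 4 * szu) : 0;
}

/* Total CMB size in bytes: unit multiplied by SZ. */
uint64_t cmbsz_bytes(uint32_t cmbsz)
{
	struct cmbsz_fields f = cmbsz_decode(cmbsz);

	return cmbsz_unit_bytes(f.szu) * f.sz;
}
```

For 0x00000100 this gives szu = 1 (a 64KB unit), sz = 0, caps = 0, and therefore a total size of 0 bytes.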
1. In nvme_map_cmb(), nvme_cmb_size() extracts SZ=0, so size = nvme_cmb_size_unit() × nvme_cmb_size() = 0. After clamping against the BAR size, size remains 0.
2. The alignment check IS_ALIGNED(size, memremap_compat_align()) passes because IS_ALIGNED(0, X) is always true: zero is a multiple of any alignment value.
3. Then pci_p2pdma_add_resource() is called with size=0. Inside that function, a zero size is replaced with the remainder of the BAR (pci_resource_len() - offset).
For this device (BAR0 = 64KB), this yields a 64KB range, which is passed to devm_memremap_pages() → pagemap_range() → arch_add_memory() → __add_pages().
4. In __add_pages(), check_pfn_span() requires both the start PFN and the page count to be subsection-aligned (2MB on ARM64 with CONFIG_SPARSEMEM_VMEMMAP). The 64KB range spans only 16 pages (PFN 0x494a0000 to 0x494a000f in the log above), so it fails the check and triggers the WARNING.
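Steps 2 to 4 can be modelled in userspace as a rough sketch. All names below are hypothetical stand-ins rather than kernel functions. The sketch assumes 4KB pages, 2MB subsections, and a CMB offset of 0, since CMBLOC is not shown in this report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed platform values: 4 KiB pages, 2 MiB subsections. */
#define MODEL_PAGE_SHIFT            12
#define MODEL_SUBSECTION_SHIFT      21
#define MODEL_PAGES_PER_SUBSECTION  \
	(1ULL << (MODEL_SUBSECTION_SHIFT - MODEL_PAGE_SHIFT))

/* Same arithmetic as IS_ALIGNED(); a must be a power of two. */
bool model_is_aligned(uint64_t x, uint64_t a)
{
	return (x & (a - 1)) == 0;
}

/* Step 3: a zero size is replaced by the rest of the BAR past offset. */
uint64_t model_p2pdma_size(uint64_t bar_len, uint64_t size, uint64_t offset)
{
	assert(offset <= bar_len);
	return size ? size : bar_len - offset;
}

/* Step 4: start PFN and page count must both be subsection multiples. */
bool model_pfn_span_ok(uint64_t start_pfn, uint64_t nr_pages)
{
	return model_is_aligned(start_pfn | nr_pages,
				MODEL_PAGES_PER_SUBSECTION);
}
```

With the values from this report, model_is_aligned(0, 1ULL << 21) holds, model_p2pdma_size(0x10000, 0, 0) returns 0x10000, and model_pfn_span_ok(0x494a0000, 0x10000 >> 12) fails because 16 pages is not a multiple of the 512 pages in a subsection.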
The core issue:
nvme_map_cmb() lacks a check for nvme_cmb_size() == 0. When SZ is zero, the controller effectively has no usable CMB, regardless of the SZU field.
Per the NVMe specification, a controller that does not support CMB should set CMBSZ to 0, but some controllers set a non-zero SZU while leaving SZ at zero, which triggers this path.
Additionally, because IS_ALIGNED(0, X) is true, the existing alignment check cannot reject a zero size, so zero-size inputs pass through it silently.
Proposed fix:
In nvme_map_cmb(), add an early return after the existing dev->cmbsz check:
 	if (!dev->cmbsz)
 		return;
+	if (!nvme_cmb_size(dev))
+		return;
This ensures that a CMB with zero size value is treated as no CMB, preventing the invalid memory range from reaching the memory hotplug path.
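As an illustration only, the effect of the two early returns can be modelled on raw register values. The function name is hypothetical, and the SZ extraction uses the bit positions from the decode earlier in this mail instead of calling the driver's nvme_cmb_size().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two early returns; not the driver's code. */
bool model_cmb_present(uint32_t cmbsz)
{
	if (!cmbsz)          /* existing check: register reads as zero */
		return false;
	if (!(cmbsz >> 12))  /* proposed check: SZ (bits 31:12) is zero */
		return false;
	return true;
}
```

In this model, 0x00000000 and 0x00000100 are both treated as "no CMB", while 0x00001100 (SZ = 1, SZU = 1) is still accepted. The second test alone would also cover an all-zero register; both are kept here to mirror the diff.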
We have verified that this fix resolves the issue on our platform.
We would appreciate any feedback or suggestions for a more comprehensive approach.
Thanks!