[PATCH v3] nvme-pci: fix CMB mapping when CMBSZ Size field is zero
kangfenglong
kangfenglong at huawei.com
Sun Jun 21 19:41:29 PDT 2026
The controller memory buffer size is defined by the SZ field in the
CMBSZ register (bits 31:12). According to the NVMe specification, a
value of zero in the SZ field indicates that no CMB is present.
Commit f65efd6dfe4e ("nvme-pci: clean up CMB initialization") replaced
the check for a zero SZ field with a check for a zero CMBSZ register
value, under the assumption that a non-zero register implies a CMB.
However, a CMBSZ register can be non-zero while the SZ field is zero,
for example when Size Units (SZU) is set to a non-zero value but the
actual size is zero (e.g. CMBSZ = 0x100: SZU = 1, SZ = 0).
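To make the example concrete, here is a minimal userspace sketch of the
field decoding. It assumes SZU occupies bits 11:8 and SZ bits 31:12; the
macro and helper names are local stand-ins, not the kernel's definitions.

```c
#include <stdint.h>

/* Assumed layout: SZU in bits 11:8, SZ in bits 31:12 of CMBSZ. */
#define CMBSZ_SZU_SHIFT	8
#define CMBSZ_SZU_MASK	0xfu
#define CMBSZ_SZ_SHIFT	12
#define CMBSZ_SZ_MASK	0xfffffu

/* Extract the Size Units field from a raw CMBSZ value. */
static uint32_t cmbsz_szu(uint32_t cmbsz)
{
	return (cmbsz >> CMBSZ_SZU_SHIFT) & CMBSZ_SZU_MASK;
}

/* Extract the Size field from a raw CMBSZ value. */
static uint32_t cmbsz_sz(uint32_t cmbsz)
{
	return (cmbsz >> CMBSZ_SZ_SHIFT) & CMBSZ_SZ_MASK;
}
```

With cmbsz = 0x100, cmbsz_szu() yields 1 and cmbsz_sz() yields 0, so a
test on the whole register sees a non-zero value while the SZ field
still reports that no CMB is present.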
When this happens, nvme_map_cmb() proceeds to compute a size of zero,
passes the alignment checks (zero is always aligned), and calls
pci_p2pdma_add_resource() with size=0. The P2PDMA subsystem then
defaults size to the entire remaining BAR, which may not be properly
aligned for memory hotplug, triggering:
  WARNING: CPU: 0 PID: 10 at mm/memory_hotplug.c:396 __add_pages
  Misaligned __add_pages start: 0x494a0000 end: 0x494a000f
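The failure path described above can be modelled in userspace roughly as
follows. This is a simplified sketch, not the kernel code: is_aligned()
and effective_size() are hypothetical helpers that only mirror the
behaviour stated in this description (a zero size passes a power-of-two
alignment test and is then replaced by the remainder of the BAR).

```c
#include <stdbool.h>
#include <stdint.h>

/* Power-of-two alignment test; zero is aligned to every boundary. */
static bool is_aligned(uint64_t x, uint64_t align)
{
	return (x & (align - 1)) == 0;
}

/*
 * Hypothetical model of the "size == 0 means the rest of the BAR"
 * default described above. The caller is assumed to have checked
 * offset <= bar_len.
 */
static uint64_t effective_size(uint64_t size, uint64_t bar_len,
			       uint64_t offset)
{
	return size ? size : bar_len - offset;
}
```

For instance, with an illustrative 64 KiB BAR, a 16 KiB offset and a
2 MiB alignment requirement, is_aligned(0, 0x200000) is true, yet
effective_size(0, 0x10000, 0x4000) is 0xc000, which is not 2 MiB
aligned. The alignment check therefore validated a size that is never
actually used.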
Fix this by:
1. Restoring the check for a zero SZ field, which explicitly verifies
that the controller actually has a CMB before proceeding.
2. Adding a boundary check in nvme_cmb_size_unit() to reject SZU values
greater than 6. The SZU field is 4 bits wide, but values 7-15 are
reserved by the NVMe specification. They yield a shift count of 40 or
more in 1ULL << (12 + 4 * szu), which exceeds the maximum 64 GB unit
size defined by the spec. For szu >= 13 the shift count reaches 64,
which is undefined behavior for a 64-bit type.
3. Checking the return value of nvme_cmb_size_unit() for zero before
proceeding, and using check_mul_overflow() for the size calculation
to detect integer overflow that could wrap to a non-zero value and
bypass the SZ check.
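The three checks above can be sketched together in userspace as shown
below. This is an illustration under stated assumptions, not the patched
driver code: cmb_size_unit() and cmb_total_size() are hypothetical
names, the field layout is taken as SZU in bits 11:8 and SZ in bits
31:12, and the GCC/Clang builtin __builtin_mul_overflow() stands in for
the kernel's check_mul_overflow().

```c
#include <stdbool.h>
#include <stdint.h>

/* Return the unit size in bytes, or 0 for a reserved SZU value. */
static uint64_t cmb_size_unit(uint32_t cmbsz)
{
	uint32_t szu = (cmbsz >> 8) & 0xf;

	if (szu > 6)
		return 0;

	return 1ULL << (12 + 4 * szu);
}

/*
 * Compute the CMB size in bytes into *size. Returns false when no
 * usable CMB is described: SZ is zero, SZU is reserved, or the
 * multiplication overflows 64 bits.
 */
static bool cmb_total_size(uint32_t cmbsz, uint64_t *size)
{
	uint64_t sz = (cmbsz >> 12) & 0xfffff;
	uint64_t unit;

	if (!sz)
		return false;

	unit = cmb_size_unit(cmbsz);
	if (!unit)
		return false;

	if (__builtin_mul_overflow(unit, sz, size))
		return false;

	return true;
}
```

With this sketch, 0x100 (SZU = 1, SZ = 0) and 0x1700 (SZU = 7, SZ = 1)
are both rejected, while 0x1000 (SZU = 0, SZ = 1) yields 4096 bytes.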
Fixes: f65efd6dfe4e ("nvme-pci: clean up CMB initialization")
Signed-off-by: kangfenglong <kangfenglong at huawei.com>
---
drivers/nvme/host/pci.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 8438c904ec49..ee2d782f4957 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2448,6 +2448,9 @@ static u64 nvme_cmb_size_unit(struct nvme_dev *dev)
 {
 	u8 szu = (dev->cmbsz >> NVME_CMBSZ_SZU_SHIFT) & NVME_CMBSZ_SZU_MASK;
+	if (szu > 6)
+		return 0;
+
 	return 1ULL << (12 + 4 * szu);
 }
@@ -2458,7 +2461,7 @@ static u32 nvme_cmb_size(struct nvme_dev *dev)
 static void nvme_map_cmb(struct nvme_dev *dev)
 {
-	u64 size, offset;
+	u64 size, unit_size, offset;
 	resource_size_t bar_size;
 	struct pci_dev *pdev = to_pci_dev(dev->dev);
 	int bar;
@@ -2472,9 +2475,15 @@ static void nvme_map_cmb(struct nvme_dev *dev)
 	dev->cmbsz = readl(dev->bar + NVME_REG_CMBSZ);
 	if (!dev->cmbsz)
 		return;
+	if (!nvme_cmb_size(dev))
+		return;
 	dev->cmbloc = readl(dev->bar + NVME_REG_CMBLOC);
-	size = nvme_cmb_size_unit(dev) * nvme_cmb_size(dev);
+	unit_size = nvme_cmb_size_unit(dev);
+	if (!unit_size)
+		return;
+	if (check_mul_overflow(unit_size, nvme_cmb_size(dev), &size))
+		return;
 	offset = nvme_cmb_size_unit(dev) * NVME_CMB_OFST(dev->cmbloc);
 	bar = NVME_CMB_BIR(dev->cmbloc);
 	bar_size = pci_resource_len(pdev, bar);
--
2.12.0.windows.1