[PATCH v3] nvme-pci: fix CMB mapping when CMBSZ Size field is zero

kangfenglong kangfenglong at huawei.com
Sun Jun 21 19:41:29 PDT 2026


The controller memory buffer size is defined by the SZ field in the
CMBSZ register (bits 31:12). According to the NVMe specification, a
value of zero in the SZ field indicates that no CMB is present.

Commit f65efd6dfe4e ("nvme-pci: clean up CMB initialization") replaced
the check for a zero SZ field with a check for a zero CMBSZ register
value, under the assumption that a non-zero register implies a CMB is
present. However, the CMBSZ register can be non-zero while its SZ field
is zero, for example when Size Units (SZU) is set to a non-zero value
but the actual size is zero (e.g. CMBSZ = 0x100: SZU = 1, SZ = 0).
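
For illustration only, the field split in that example can be sketched in
plain userspace C. The macro and function names below are hypothetical
stand-ins, not the driver's NVME_CMBSZ_* definitions; the layout assumed is
SZU in bits 11:8 and SZ in bits 31:12.

```c
#include <stdint.h>

/* Assumed layout: SZU in bits 11:8, SZ in bits 31:12 (hypothetical names). */
#define CMBSZ_SZU_SHIFT 8
#define CMBSZ_SZU_MASK  0xfu
#define CMBSZ_SZ_SHIFT  12
#define CMBSZ_SZ_MASK   0xfffffu

/* Extract the Size Units field from a raw CMBSZ value. */
static inline uint32_t cmbsz_szu(uint32_t cmbsz)
{
	return (cmbsz >> CMBSZ_SZU_SHIFT) & CMBSZ_SZU_MASK;
}

/* Extract the Size field from a raw CMBSZ value. */
static inline uint32_t cmbsz_sz(uint32_t cmbsz)
{
	return (cmbsz >> CMBSZ_SZ_SHIFT) & CMBSZ_SZ_MASK;
}
```

With these helpers, a raw value of 0x100 is non-zero as a register but
decodes to SZU = 1 and SZ = 0, which is the case a whole-register test
does not catch.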

When this happens, nvme_map_cmb() proceeds to compute a size of zero,
passes the alignment checks (zero is always aligned), and calls
pci_p2pdma_add_resource() with size=0. The P2PDMA subsystem then
defaults size to the entire remaining BAR, which may not be properly
aligned for memory hotplug, triggering:

  WARNING: CPU: 0 PID: 10 at mm/memory_hotplug.c:396 __add_pages
  Misaligned __add_pages start: 0x494a0000 end: 0x494a000f
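
The failure path above can be modelled in a few lines of userspace C. This
is only a sketch under stated assumptions: is_aligned() mimics a
power-of-two alignment test, and effective_size() is a hypothetical model
of the "size 0 means the rest of the BAR" fallback, not the P2PDMA code
itself.

```c
#include <stdint.h>

/* Power-of-two alignment test; 'a' must be a power of two. */
static inline int is_aligned(uint64_t x, uint64_t a)
{
	return (x & (a - 1)) == 0;
}

/*
 * Hypothetical model of the fallback described above: a requested size
 * of zero is replaced by whatever remains of the BAR after 'offset'.
 */
static inline uint64_t effective_size(uint64_t size, uint64_t bar_len,
				      uint64_t offset)
{
	return size ? size : bar_len - offset;
}
```

A size of zero satisfies is_aligned() for any alignment, so the caller's
check does not reject it. The substituted size is never checked, so a
64 KiB BAR (an illustrative value, not taken from the report) would be
registered whole even when a 2 MiB alignment is required.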

Fix this by:
1. Restoring the check for a zero SZ field, which explicitly verifies
   that the controller actually has a CMB before proceeding.
2. Adding a bounds check in nvme_cmb_size_unit() to reject SZU values
   greater than 6. The SZU field is 4 bits wide, but values 7-15 are
   reserved by the NVMe specification. They would produce a shift count
   of 40 or more in 1ULL << (12 + 4 * szu), giving a unit larger than
   the maximum 64 GB unit size defined by the spec; for szu >= 13 the
   shift count reaches 64, which is undefined for a 64-bit operand.
3. Checking the return value of nvme_cmb_size_unit() for zero before
   proceeding, and using check_mul_overflow() for the size calculation
   to detect integer overflow that could wrap to a non-zero value and
   bypass the SZ check.
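
The three checks listed above can be sketched together as a standalone
userspace model. The function names are hypothetical, and
__builtin_mul_overflow() (GCC/Clang) stands in here for the kernel's
check_mul_overflow(); this is not the driver code.

```c
#include <stdint.h>

/* Unit size in bytes, or 0 for a reserved SZU encoding (7-15). */
static inline uint64_t cmb_size_unit(uint32_t cmbsz)
{
	uint32_t szu = (cmbsz >> 8) & 0xf;

	if (szu > 6)
		return 0;
	return 1ULL << (12 + 4 * szu);
}

/*
 * Store the CMB size in bytes in *size and return 0, or return -1 when
 * SZ is zero, SZU is reserved, or the multiplication overflows.
 */
static inline int cmb_size_bytes(uint32_t cmbsz, uint64_t *size)
{
	uint32_t sz = (cmbsz >> 12) & 0xfffff;
	uint64_t unit = cmb_size_unit(cmbsz);

	if (!sz || !unit)
		return -1;
	if (__builtin_mul_overflow(unit, (uint64_t)sz, size))
		return -1;
	return 0;
}
```

In this model, CMBSZ = 0x100 is rejected because SZ is zero, and any value
with SZU = 7 or above is rejected regardless of SZ.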

Fixes: f65efd6dfe4e ("nvme-pci: clean up CMB initialization")
Signed-off-by: kangfenglong <kangfenglong at huawei.com>
---
 drivers/nvme/host/pci.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 8438c904ec49..ee2d782f4957 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2448,6 +2448,9 @@ static u64 nvme_cmb_size_unit(struct nvme_dev *dev)
 {
 	u8 szu = (dev->cmbsz >> NVME_CMBSZ_SZU_SHIFT) & NVME_CMBSZ_SZU_MASK;
 
+	if (szu > 6)
+		return 0;
+
 	return 1ULL << (12 + 4 * szu);
 }
 
@@ -2458,7 +2461,7 @@ static u32 nvme_cmb_size(struct nvme_dev *dev)
 
 static void nvme_map_cmb(struct nvme_dev *dev)
 {
-	u64 size, offset;
+	u64 size, unit_size, offset;
 	resource_size_t bar_size;
 	struct pci_dev *pdev = to_pci_dev(dev->dev);
 	int bar;
@@ -2472,9 +2475,15 @@ static void nvme_map_cmb(struct nvme_dev *dev)
 	dev->cmbsz = readl(dev->bar + NVME_REG_CMBSZ);
 	if (!dev->cmbsz)
 		return;
+	if (!nvme_cmb_size(dev))
+		return;
 	dev->cmbloc = readl(dev->bar + NVME_REG_CMBLOC);
 
-	size = nvme_cmb_size_unit(dev) * nvme_cmb_size(dev);
+	unit_size = nvme_cmb_size_unit(dev);
+	if (!unit_size)
+		return;
+	if (check_mul_overflow(unit_size, nvme_cmb_size(dev), &size))
+		return;
 	offset = nvme_cmb_size_unit(dev) * NVME_CMB_OFST(dev->cmbloc);
 	bar = NVME_CMB_BIR(dev->cmbloc);
 	bar_size = pci_resource_len(pdev, bar);
-- 
2.12.0.windows.1



