[PATCH] nvme-sysfs: display max_hw_sectors_kb without requiring namespaces
Abhishek Bapat
abhishekbapat at google.com
Wed Oct 16 14:31:08 PDT 2024
From: Abhishek <abhishekbapat at google.com>
The NVMe driver initializes max_hw_sectors_kb by issuing the NVMe
Identify Controller command and reading the MDTS (Maximum Data
Transfer Size) field. Commit 3710e2b056cb ("nvme-pci: clamp
max_hw_sectors based on DMA optimized limitation") capped
max_hw_sectors_kb at 128KiB (MDTS = 5). This restriction was
implemented to mitigate lockups encountered on high-core-count AMD
servers.
Currently, user space applications have two ways to obtain the
max_hw_sectors_kb value when sizing the payload of an NVMe command:
issue the Identify Controller command or query the kernel. If the
underlying NVMe device supports MDTS > 5 (128KiB) and the application
takes the MDTS value from the Identify Controller command, it can
build an NVMe command with a payload larger than 128KiB. The kernel
rejects such a command with Invalid Argument (-EINVAL), so the
application cannot issue it through any of the kernel-supported I/O
APIs. Presently, the kernel exposes the max_hw_sectors_kb value
through a queue sysfs file. However, this file is only present for an
NVMe device once a namespace has been created on it, so querying
max_hw_sectors_kb requires a namespace to exist. This dependency is
semantically incorrect, as MDTS is a controller-associated field
(section 5.1.13, NVMe specification 2.1) and should be accessible
regardless of the presence of a namespace on the NVMe device.
Expose the value of max_hw_sectors_kb through NVMe sysfs so that it can
be read without first creating a namespace on the device.
Signed-off-by: Abhishek <abhishekbapat at google.com>
---
drivers/nvme/host/sysfs.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/drivers/nvme/host/sysfs.c b/drivers/nvme/host/sysfs.c
index b68a9e5f1ea3..1af2b2cf1a6c 100644
--- a/drivers/nvme/host/sysfs.c
+++ b/drivers/nvme/host/sysfs.c
@@ -546,6 +546,17 @@ static ssize_t dctype_show(struct device *dev,
}
static DEVICE_ATTR_RO(dctype);
+static ssize_t max_hw_sectors_kb_show(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct nvme_ctrl *ctrl = dev_get_drvdata(dev);
+ u32 max_hw_sectors_kb = ctrl->max_hw_sectors >> 1;
+
+ return sysfs_emit(buf, "%u\n", max_hw_sectors_kb);
+}
+static DEVICE_ATTR_RO(max_hw_sectors_kb);
+
#ifdef CONFIG_NVME_HOST_AUTH
static ssize_t nvme_ctrl_dhchap_secret_show(struct device *dev,
struct device_attribute *attr, char *buf)
@@ -687,6 +698,7 @@ static struct attribute *nvme_dev_attrs[] = {
&dev_attr_kato.attr,
&dev_attr_cntrltype.attr,
&dev_attr_dctype.attr,
+ &dev_attr_max_hw_sectors_kb.attr,
#ifdef CONFIG_NVME_HOST_AUTH
&dev_attr_dhchap_secret.attr,
&dev_attr_dhchap_ctrl_secret.attr,
--
2.47.0.rc1.288.g06298d1525-goog