Increase maxsize of io_queue_depth for nvme driver?

Apachez apachez at gmail.com
Fri Sep 19 01:13:46 PDT 2025


On Fri, Sep 19, 2025 at 2:29 AM Chaitanya Kulkarni
<chaitanyak at nvidia.com> wrote:
>
> On 9/14/25 7:31 AM, Keith Busch wrote:
> >
> > On Sun, Sep 14, 2025 at 02:45:35PM +0200, Apachez wrote:
> > >
> > > I would like to propose that NVME_PCI_MAX_QUEUE_SIZE should be
> > > increased from 4095 to 32767 to match the current NVMe specification
> > > regarding MQES and to give the sysop the ability to fully utilize the
> > > performance of the hardware being used, or am I missing something
> > > here?
> >
> > We use the upper bits of the command id to detect duplicate completions,
> > so we don't have enough bits to tag commands beyond 4095. As far as I
> > know, though, we haven't seen such breakage in a *long* time. It's more
> > of a sanity thing to know with high certainty that duplicate completions
> > are not occurring.
> >
> > But I don't think you'll see any performance difference by increasing
> > the queue depth. Devices saturate the link at far lower already, so
> > going higher just increases completion latency.
>
> Apachez, can you share the performance numbers where it shows a clear win
> for the performance difference with the above mentioned queue depth number?
>
> -ck


Hi,

I currently don't have any hard data showing that an io_queue_depth larger
than 4095 improves anything, other than that the NVMe specification defines
MQES as a 15-bit number, meaning (as I interpret it) that
NVME_PCI_MAX_QUEUE_SIZE should allow a maximum of 32768 rather than 4095.
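
If I understand Keith's explanation correctly, the 4095 cap comes from the
driver reserving the upper bits of the 16-bit command identifier for
duplicate-completion detection, leaving only the lower bits for the tag
itself. A quick back-of-the-envelope comparison (the exact 12-bit tag split
is my assumption based on the 4095 limit, I have not verified it against
the driver source):

# Assumed layout: low 12 bits of the 16-bit command id carry the tag,
# the upper bits are reserved for duplicate-completion detection.
echo $(( (1 << 12) - 1 ))   # 4095  - current NVME_PCI_MAX_QUEUE_SIZE
echo $(( (1 << 15) - 1 ))   # 32767 - what a 15-bit tag field would allow
echo $(( (1 << 16) - 1 ))   # 65535 - full 16-bit command id space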

And one of the features of NVMe storage over HDD or SSD is support for a
larger queue size.

It's often claimed to be something like:

SATA <= 32, SAS <= 256, NVMe <= 65535.
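
For what it's worth, this is how I compare what the kernel actually exposes
per device class (device names are just examples from my systems, and my
understanding is that nr_requests follows the hardware tag set when no I/O
scheduler is active):

# SATA/SAS: per-device queue depth negotiated by the SCSI layer
cat /sys/block/sda/device/queue_depth
# NVMe: the module-wide cap the driver was loaded with
cat /sys/module/nvme/parameters/io_queue_depth
# NVMe: block-layer request depth for one namespace
cat /sys/block/nvme0n1/queue/nr_requests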

As an example, the Micron 7450 MAX datasheet states that the drive supports
a queue size of 8192, but Linux currently only lets me set a maximum of 4095.

Using a large io_queue_depth doesn't seem to hurt on the above NVMe drives.

I'm currently using this for ZFS and NVMe:

# Set maximum number of I/Os active to each device
# Should be equal to or greater than the sum of each queue's *_max_active
# Normally SATA <= 32, SAS <= 256, NVMe <= 65535.
# To find out supported max queue for NVMe:
# nvme show-regs -H /dev/nvmeX | grep -i 'Maximum Queue Entries Supported'
# For NVMe should match /sys/module/nvme/parameters/io_queue_depth
# nvme.io_queue_depth limits are >= 2 and <= 4095
options zfs zfs_vdev_max_active=4095
options nvme io_queue_depth=4095
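
And this is how I verify that the values actually took effect (the
initramfs step is distro-dependent; update-initramfs -u is what I use on
Debian/Ubuntu):

# Rebuild the initramfs so the module options apply at early boot, then reboot
update-initramfs -u
# After the reboot, confirm both module parameters
cat /sys/module/nvme/parameters/io_queue_depth
cat /sys/module/zfs/parameters/zfs_vdev_max_active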

Kind Regards
Apachez


