[PATCH V3 0/8] nvme: Refactor and expose per-controller timeout configuration
Hannes Reinecke
hare at suse.de
Mon Apr 13 01:12:46 PDT 2026
On 4/10/26 09:39, Maurizio Lombardi wrote:
> This patchset tries to address some limitations in how the NVMe driver handles
> command timeouts.
> Currently, the driver relies heavily on global module parameters
> (NVME_IO_TIMEOUT and NVME_ADMIN_TIMEOUT), making it difficult for users to
> tune timeouts for specific controllers that may have very different
> characteristics. Also, in some cases, manual changes to sysfs timeout values
> are ignored by the driver logic.
>
> For example, this patchset removes the unconditional timeout assignment in
> nvme_init_request. This allows the block layer to correctly apply the request
> queue's timeout settings, ensuring that user-initiated changes via sysfs
> are actually respected for all requests.
>
> It introduces new sysfs attributes (admin_timeout and io_timeout) to the NVMe
> controller. This allows users to configure distinct timeout requirements for
> different controllers rather than relying on global module parameters.
>
> Some examples:
>
> Changes to the controller's io_timeout are propagated to all
> of the associated namespaces' queues:
>
> # find /sys -name 'io_timeout'
> /sys/devices/virtual/nvme-fabrics/ctl/nvme0/nvme0c0n1/queue/io_timeout
> /sys/devices/virtual/nvme-fabrics/ctl/nvme0/nvme0c0n2/queue/io_timeout
> /sys/devices/virtual/nvme-fabrics/ctl/nvme0/nvme0c0n3/queue/io_timeout
> /sys/devices/virtual/nvme-fabrics/ctl/nvme0/io_timeout
>
> # echo 27000 > /sys/devices/virtual/nvme-fabrics/ctl/nvme0/io_timeout
> # cat /sys/devices/virtual/nvme-fabrics/ctl/nvme0/nvme0c0n1/queue/io_timeout
> 27000
> # cat /sys/devices/virtual/nvme-fabrics/ctl/nvme0/nvme0c0n2/queue/io_timeout
> 27000
> # cat /sys/devices/virtual/nvme-fabrics/ctl/nvme0/nvme0c0n3/queue/io_timeout
> 27000
>
> When adding a namespace on the target side, the io_timeout is inherited
> from the controller's preferred timeout:
>
> * target side *
> # nvmetcli
> /> cd subsystems/test-nqn/namespaces/4
> /subsystems/t.../namespaces/4> enable
> The Namespace has been enabled.
>
> ************
>
> * Host-side *
> nvme nvme0: rescanning namespaces.
> # find /sys -name 'io_timeout'
> /sys/devices/virtual/nvme-fabrics/ctl/nvme0/nvme0c0n1/queue/io_timeout
> /sys/devices/virtual/nvme-fabrics/ctl/nvme0/nvme0c0n2/queue/io_timeout
> /sys/devices/virtual/nvme-fabrics/ctl/nvme0/nvme0c0n3/queue/io_timeout
> /sys/devices/virtual/nvme-fabrics/ctl/nvme0/nvme0c0n4/queue/io_timeout <-- new namespace
> /sys/devices/virtual/nvme-fabrics/ctl/nvme0/io_timeout
>
> # cat /sys/devices/virtual/nvme-fabrics/ctl/nvme0/nvme0c0n4/queue/io_timeout
> 27000
> ***********
>
> io_timeout and admin_timeout module parameters are used as default
> values for new controllers:
>
> # nvme connect -t tcp -a 10.37.153.138 -s 8000 -n test-nqn2
> connecting to device: nvme1
>
> # cat /sys/devices/virtual/nvme-fabrics/ctl/nvme1/nvme1c1n1/queue/io_timeout
> 30000
> # cat /sys/devices/virtual/nvme-fabrics/ctl/nvme1/admin_timeout
> 60000
>
> V3: - Rebase on top of nvme 7.1 branch
> - add an admin_timeout variable to nvme_ctrl structure
> - Wait until the controller has reached the LIVE state
> for the first time before allowing the user to modify
> the timeouts; this prevents dereferencing the admin_q,
> fabrics_q and connect_q queues before their initialization.
> - move blk_put_queue(fabrics_q) to nvme_free_ctrl() to align it to
> admin_q teardown
> - modify nvmet-loop to avoid deleting and re-creating the admin_q queue
> when the controller enters the resetting state.
> - add a warning if nvme_alloc_admin_tag_set() is called twice against
> the same controller
>
> V2: - Drop the RFC tag
> - apply the timeout settings to fabrics_q and connect_q too
> - Code style fixes
> - remove unnecessary check for null admin_q in __nvme_delete_io_queues()
> - Use DEVICE_ATTR() macro
>
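One note on units in the examples above: the nvme_core module parameters are
given in seconds, while the queue/io_timeout attribute reports milliseconds,
so the 30000/60000 values line up with the 30s/60s defaults. A quick sanity
check of that conversion (parameter values hard-coded here rather than read
from /sys/module/nvme_core/parameters/):

```shell
# Stand-ins for /sys/module/nvme_core/parameters/{io,admin}_timeout (seconds)
io_timeout_s=30
admin_timeout_s=60
# The block layer's queue/io_timeout attribute reports milliseconds:
echo $(( io_timeout_s * 1000 ))      # matches the 30000 shown above
echo $(( admin_timeout_s * 1000 ))   # matches the 60000 shown above
```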
What about KATO?
With this patchset the user can set the I/O timeout to arbitrary values,
which can easily be lower than KATO.
And as per spec a KATO timeout implies a transport disruption, requiring
a controller reset.
But due to the internal design of the nvme error handling we do conflate
transport disruption and command timeout, so an _I/O_ timeout triggers
a controller reset.
Which means that a command timeout lower than KATO will result in false
positives, with the controller being reset even though the connection
is perfectly happy.
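A userspace guard along these lines could at least reject such values up
front (a sketch; the 'kato' sysfs attribute name and the
seconds-vs-milliseconds units are assumptions on my side, and the values are
stand-ins for sysfs reads):

```shell
# Refuse an io_timeout below KATO, to avoid spurious controller resets.
kato_s=5                # e.g. from the controller's 'kato' attribute (seconds)
new_io_timeout_ms=27000 # candidate value for .../nvme0/io_timeout (ms)
if [ "$new_io_timeout_ms" -lt $(( kato_s * 1000 )) ]; then
    echo "refusing: io_timeout below KATO risks spurious controller resets" >&2
    exit 1
fi
echo "applying io_timeout=${new_io_timeout_ms}ms"
```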
If you were to go the way of making the I/O timeout configurable, I
strongly suggest implementing command abort first (hello John ;-),
as then you can simply cancel a command which ran into a timeout
without having to reset the controller.
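For reference, the Abort admin command is opcode 0x08 with the SQID in CDW10
bits 15:0 and the CID in bits 31:16, so it can already be exercised from
userspace via nvme-cli's admin-passthru; sqid and cid below are made-up
example values:

```shell
sqid=1            # submission queue of the stuck command (example value)
cid=$(( 0x42 ))   # its command identifier (example value)
# CDW10 layout for Abort: CID in bits 31:16, SQID in bits 15:0
cdw10=$(( (cid << 16) | sqid ))
printf 'nvme admin-passthru /dev/nvme0 --opcode=0x08 --cdw10=0x%x\n' "$cdw10"
```

Note that Abort is best-effort per the spec (and bounded by the controller's
Abort Command Limit), which is exactly why proper in-kernel support is needed.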
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare at suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich