[PATCH v2] nvme: Let the blocklayer set timeouts for requests
Sagi Grimberg
sagi at grimberg.me
Fri Dec 12 05:43:17 PST 2025
On 05/12/2025 9:33, Heyne, Maximilian wrote:
> On Thu, Dec 04, 2025 at 07:34:26PM +0000, Heyne, Maximilian wrote:
>> On Thu, Dec 04, 2025 at 09:13:35AM -0700, Keith Busch wrote:
>>> On Thu, Dec 04, 2025 at 02:11:50PM +0000, Heyne, Maximilian wrote:
>>>> @@ -724,10 +724,8 @@ void nvme_init_request(struct request *req, struct nvme_command *cmd)
>>>> struct nvme_ns *ns = req->q->disk->private_data;
>>>>
>>>> logging_enabled = ns->head->passthru_err_log_enabled;
>>>> - req->timeout = NVME_IO_TIMEOUT;
>>>> } else { /* no queuedata implies admin queue */
>>>> logging_enabled = nr->ctrl->passthru_err_log_enabled;
>>>> - req->timeout = NVME_ADMIN_TIMEOUT;
>>>> }
>>> I was trying to think of any in-kernel path using __submit_sync_cmd with
>>> an IO queue, and quick search shows there's just one: zns report zones.
>>>
>>> Everything else uses the admin queue, which doesn't have a sysfs tunable
>>> for its request_queue's default timeout. All we have is the nvme module
>>> parameter, which is writable after loading. Since that's the only way a
>>> user can modify the default time for that queue, I think we need to
>>> leave that req->timeout value as-is.
>> Ok sound like a v3 is needed where I only delete the line with
>> NVME_IO_TIMEOUT but leave the NVME_ADMIN_TIMEOUT and add a comment about
>> it. Will prepare such a patch.
>>
> I thought about this a bit more. Considering that the module parameters
> can be written to produces even more inconsistencies.
>
> With my proposed change we will have the following situation:
> 1) when a device is probed current settings for IO and admin timeout from
> the module parameters will be the respective default timeouts
> 2) almost all admin commands to the device will default to the initial
> admin timeout
> 3) almost all IO commands will adhere to whatever is set via sysfs or
> the initial default
>
> -> changes to the module parameters will (mostly) not affect already
> probed devices
>
> When we keep initializing the admin timeouts to the module parameter as
> you suggest (so half of my patch), we'll have the following situation:
>
> 1) same as above
> 2) some admin commands will use the module parameters admin timeout,
> some (ioctl's via nvme_submit_user_cmd) will use the admin queue's timeout set in 1).
> This can be made consistent if we fix nvme_submit_user_cmd.
> 3) same as above; IO timeout not affected by module parameter changes.
>
> -> changes to the module parameters affect only the admin commands but
> (mostly) not the IO commands which is inconsistent behavior.
>
> Therefore, I wonder whether people are actually making changes to the
> module parameters at runtime to adjust all device's admin timeouts. In
> that case it's at least broken for ioctl's currently.
> Should we make that timeout runtime configurable per-device too instead?
>
> In summary, I think my patch as-is leads to better consistency.
Perhaps you can simply add admin_timeout sysfs file for the controller
that would alter
the set->timeout and take the value from there in nvme_init_request
More information about the Linux-nvme
mailing list