[PATCH RFC 4/5] nvme: add sysfs attribute to change IO timeout per nvme controller
Maurizio Lombardi
mlombard at arkamax.eu
Mon Feb 23 02:36:07 PST 2026
On Fri Feb 20, 2026 at 6:53 PM CET, Mohamed Khalfella wrote:
> On Fri 2026-02-20 13:47:08 +0100, Maurizio Lombardi wrote:
>> On Thu Feb 19, 2026 at 6:22 PM CET, Maurizio Lombardi wrote:
>> > On Wed Feb 18, 2026 at 6:54 PM CET, Mohamed Khalfella wrote:
>> >
>> > So changing the timeout field in the tagset should be doable, the
>> > only problem is avoid racing against nvme_alloc_ns().
>> >
>> > I will try to come up with something.
>>
>> I decided to keep the current design, calling blk_queue_rq_timeout()
>> with the namespaces_lock mutex locked is the easiest solution
>>
>> I am sending a V2 in a few moments.
>>
>
> How about restricting changing io timeout to LIVE controllers only. This
> will make the problem easier to solve. Maybe something like the below?
>
> enum nvme_ctrl_state state;
> unsigned long flags;
>
> spin_lock_irqsave(&ctrl->lock, flags);
> state = nvme_ctrl_state(ctrl);
> if (state != NVME_CTRL_LIVE) {
> spin_unlock_irqrestore(&ctrl->lock, flags);
> return -EBUSY;
> }
I previously considered this solution, but I discarded it for two reasons:
1) If the user sets the io_timeout too low, the controller ends up
resetting non-stop. With this restriction, it becomes hard to catch the
controller in a LIVE state and fix the problem, because the controller is
almost always RESETTING. You essentially have to wait until the driver
gives up and removes the controller.
2) This doesn't prevent nvme_alloc_ns() from racing against the sysfs
path.
Suppose that thread A is executing nvme_alloc_ns() and has just called
disk = blk_mq_alloc_disk(ctrl->tagset, &lim, ns);
Immediately after, the user changes io_timeout. Thread B takes the
namespaces_lock, scans the namespaces list and calls
blk_queue_rq_timeout() on each namespace's queue.
Thread A can now take the namespaces_lock and add the new namespace to
the list, but the new namespace is still using the stale timeout it
inherited from ctrl->tagset.
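
To make the interleaving concrete, here is a minimal userspace sketch that
replays the three steps in order (allocate, sysfs store, link). It is a model
only, not kernel code: every model_* name, the fixed-size array and the
values 30 and 5 are hypothetical and chosen for illustration. The "reapply"
flag stands for refreshing the queue timeout while the namespace is linked
under namespaces_lock, which is my assumption of what closing the window
would look like.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; none of these types are the kernel's. */
struct model_queue {
	unsigned int rq_timeout;
};

struct model_ctrl {
	unsigned int tagset_timeout;	/* stands in for ctrl->tagset->timeout */
	unsigned int io_timeout;	/* stands in for ctrl->io_timeout */
	struct model_queue *ns[8];	/* stands in for ctrl->namespaces */
	size_t nr_ns;
};

/* Thread A, step 1: the new queue copies the tagset timeout. */
static void model_alloc_disk(const struct model_ctrl *c, struct model_queue *q)
{
	q->rq_timeout = c->tagset_timeout;
}

/* Thread B (sysfs store): update the controller, then every listed namespace. */
static void model_store_timeout(struct model_ctrl *c, unsigned int t)
{
	c->tagset_timeout = t;
	c->io_timeout = t;
	for (size_t i = 0; i < c->nr_ns; i++)
		c->ns[i]->rq_timeout = t;
}

/*
 * Thread A, step 2: link the namespace into the list. With reapply set,
 * the timeout is refreshed at link time, as if done under namespaces_lock.
 */
static void model_link_ns(struct model_ctrl *c, struct model_queue *q, int reapply)
{
	assert(c->nr_ns < 8);
	if (reapply)
		q->rq_timeout = c->io_timeout;
	c->ns[c->nr_ns++] = q;
}

/* Replays alloc -> store -> link and returns the new namespace's timeout. */
static unsigned int model_replay(int reapply)
{
	struct model_ctrl c = { .tagset_timeout = 30, .io_timeout = 30 };
	struct model_queue q;

	model_alloc_disk(&c, &q);
	model_store_timeout(&c, 5);	/* list is still empty, q is missed */
	model_link_ns(&c, &q, reapply);
	return q.rq_timeout;
}
```

In this model, model_replay(0) leaves the new namespace with the old value
(30), while model_replay(1) picks up the value written by the store (5).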
Maurizio
>
> if (ctrl->queue_count > 1)
> WRITE_ONCE(ctrl->tagset->timeout, timeout);
> spin_unlock_irqrestore(&ctrl->lock, flags);
>
> /* Take the namespaces_lock to avoid racing against nvme_alloc_ns() */
> mutex_lock(&ctrl->namespaces_lock);
>
> ctrl->io_timeout = msecs_to_jiffies(timeout);
> list_for_each_entry(ns, &ctrl->namespaces, list)
> blk_queue_rq_timeout(ns->queue, ctrl->io_timeout);
>
> mutex_unlock(&ctrl->namespaces_lock);
>
>
>> Maurizio
>>