[PATCH v4 08/15] nvme: Implement cross-controller reset recovery
Randy Jennings
randyj at purestorage.com
Thu May 14 19:32:00 PDT 2026
On Fri, Mar 27, 2026 at 5:46 PM Mohamed Khalfella
<mkhalfella at purestorage.com> wrote:
>
> A host that has more than one path connecting to an nvme subsystem
> typically has an nvme controller associated with every path. This is
> mostly applicable to nvmeof. If one path goes down, inflight IOs on that
> path should not be retried immediately on another path because this
> could lead to data corruption as described in TP4129. TP8028 defines
> cross-controller reset mechanism that can be used by host to terminate
> IOs on the failed path using one of the remaining healthy paths. Only
> after IOs are terminated, or long enough time passes as defined by
> TP4129, inflight IOs should be retried on another path. Implement core
> cross-controller reset shared logic to be used by the transports.
>
> Signed-off-by: Mohamed Khalfella <mkhalfella at purestorage.com>
> + now = jiffies;
> + if (time_before(now, deadline))
> + tmo = min_t(unsigned long,
> + secs_to_jiffies(ictrl->kato), deadline - now);
At LSF last week, we talked about dropping the min_t() clamp here and
just waiting for the rest of the time (deadline - now).
The arguments for shortening the wait to KATO were dubious, so the
simpler logic seemed better.
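
Roughly, the simpler calculation could look like the sketch below. This is
a userspace model, not the patch itself: ccr_remaining() and
time_before_ul() are hypothetical names, the comparison only mimics the
kernel's wraparound-safe time_before(), and returning 0 once the deadline
has passed is an assumption.

```c
/* Userspace stand-in for the kernel's wraparound-safe jiffies compare. */
static int time_before_ul(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/*
 * Hypothetical helper: remaining wait budget in ticks, with no KATO
 * clamp. Assumption: an expired deadline yields a zero timeout.
 */
static unsigned long ccr_remaining(unsigned long now, unsigned long deadline)
{
	return time_before_ul(now, deadline) ? deadline - now : 0;
}
```

With this, the wait would use the whole remaining budget, and the
unsigned subtraction stays correct across a counter wrap.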
> +
> + if (!wait_for_completion_timeout(&ccr.complete, tmo)) {
> + ret = -ETIMEDOUT;
> + goto out;
> + }
Sincerely,
Randy Jennings