[RFC PATCH 08/14] nvme: Implement cross-controller reset recovery
Sagi Grimberg
sagi at grimberg.me
Sun Jan 4 13:14:38 PST 2026
On 31/12/2025 2:04, Randy Jennings wrote:
>>> +
>>> + if (!ret) {
>>> + dev_info(ictrl->device, "CCR succeeded using %s\n",
>>> + dev_name(sctrl->device));
>>> + blk_put_queue(sctrl->admin_q);
>>> + nvme_put_ctrl(sctrl);
>>> + return 0;
>>> + }
>>> +
>>> + /* Try another controller */
>>> + min_cntlid = sctrl->cntlid + 1;
>> OK, I see why min_cntlid is used. That is very non-intuitive.
>>
>> I'm wondering if it would be simpler to take one shot at ccr and,
>> if it fails, fall back to crt. I mean, if the sctrl is alive and it was
>> unable to reset the ictrl in time, how would another ctrl do a better
>> job here?
> There are many different kinds of failures we are dealing with here
> that result in a dropped connection (association). It could be a problem
> with the specific link, or it could be that the node of an HA pair in the
> storage array went down. In the case of a specific link problem, maybe
> only one of the connections is down and any controller would work.
> In the case of the node of an HA pair, roughly half of the connections
> are going down, and there is a race over which controllers are
> detected as down first. There were some heuristics put into the
> spec about deciding which controller to use, but that is more code
> and a refinement that could come later (and they are still heuristics;
> they may not be helpful).
>
> Because CCR offers a significant win of shortening the recovery time
> substantially, it is worth retrying on the other controllers. This time
> affects when we can start retrying IO. KATO is in seconds, and
> NVMe-oF should have the capability of doing a significant amount of
> IOs in each of those seconds.
But it doesn't actually do I/O; it issues I/O and then waits for it to
time out.
>
> Besides, the alternative is just to wait. Might as well be actively trying
> to shorten that wait time. Besides a small increase in code complexity,
> is there a downside to doing so?
Simplicity is very important when it comes to non-trivial code paths
like error recovery.
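
To make the two options concrete, here is a rough userspace sketch, not the patch's code. Everything in it (struct fake_ctrl, find_source(), try_ccr(), the max_attempts knob) is hypothetical and only models the control flow: pick the live controller with the lowest cntlid at or above min_cntlid, try CCR through it, and on failure either move on (min_cntlid = cntlid + 1, as in the hunk above) or give up and fall back to waiting.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for another controller in the same subsystem. */
struct fake_ctrl {
	int cntlid;	/* controller ID */
	bool live;	/* still has a working association */
	bool ccr_ok;	/* assumed outcome of a CCR issued through it */
};

/* Return the live controller with the lowest cntlid >= min_cntlid, or NULL. */
static const struct fake_ctrl *find_source(const struct fake_ctrl *c,
					   size_t n, int min_cntlid)
{
	const struct fake_ctrl *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!c[i].live || c[i].cntlid < min_cntlid)
			continue;
		if (!best || c[i].cntlid < best->cntlid)
			best = &c[i];
	}
	return best;
}

/*
 * Return the cntlid of the controller whose CCR succeeded, or -1 if the
 * caller should fall back to waiting.
 * max_attempts == 1 models the one-shot variant; 0 means try every
 * candidate controller.
 */
static int try_ccr(const struct fake_ctrl *c, size_t n, int max_attempts)
{
	const struct fake_ctrl *s;
	int min_cntlid = 0;
	int attempts = 0;

	while ((s = find_source(c, n, min_cntlid)) != NULL) {
		if (s->ccr_ok)
			return s->cntlid;
		if (max_attempts && ++attempts >= max_attempts)
			break;
		/* Try another controller */
		min_cntlid = s->cntlid + 1;
	}
	return -1;
}
```

In this model the one-shot variant is just the loop with max_attempts == 1, so the extra complexity of retrying is the min_cntlid bookkeeping plus whatever each additional attempt costs in time, which the sketch does not model.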