[PATCH RFC 3/3] nvme: delay failover by command quiesce timeout

Tue Apr 15 05:11:04 PDT 2025

On Tue, Apr 15, 2025 at 01:28:15AM +0300, Sagi Grimberg wrote:
> > > +void nvme_schedule_failover(struct nvme_ctrl *ctrl)
> > > +{
> > > +	unsigned long delay;
> > > +
> > > +	if (ctrl->cqt)
> > > +		delay = msecs_to_jiffies(ctrl->cqt);
> > > +	else
> > > +		delay = ctrl->kato * HZ;
> > I thought that delay = m * ctrl->kato + ctrl->cqt
> > where m = ctrl->ctratt & NVME_CTRL_ATTR_TBKAS ? 3 : 2
> > no?
> 
> This was said before, but if we are going to always start waiting for kato
> for failover purposes, we first need a patch that prevents kato from being
> arbitrarily long.

That should be addressed with the cross controller reset (CCR). KATO*n
+ CQT is the upper limit for the target recovery. As soon as we have CCR,
the recovery delay is reduced to the time the CCR exchange takes.
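
To make that upper limit concrete, here is a rough userspace sketch (not
kernel code) of the bound suggested earlier in the thread, m * KATO + CQT
with m = 3 when TBKAS is set and m = 2 otherwise. It assumes kato is in
seconds and cqt in milliseconds, as the quoted hunk implies. The names
sketch_failover_delay_ms and SKETCH_CTRATT_TBKAS are made up for
illustration, and the bit position is assumed to match the kernel's
NVME_CTRL_ATTR_TBKAS.

```c
#include <stdint.h>

/* Assumption: mirrors the kernel's NVME_CTRL_ATTR_TBKAS (CTRATT bit 6). */
#define SKETCH_CTRATT_TBKAS (1u << 6)

/*
 * Upper bound on the failover delay, in milliseconds:
 *   m * KATO + CQT, with m = 3 if TBKAS is set, else m = 2.
 * kato_s is in seconds, cqt_ms is in milliseconds.
 */
static uint64_t sketch_failover_delay_ms(uint32_t ctratt, uint32_t kato_s,
					 uint32_t cqt_ms)
{
	uint64_t m = (ctratt & SKETCH_CTRATT_TBKAS) ? 3 : 2;

	/* Convert KATO to milliseconds before adding CQT. */
	return m * kato_s * 1000ULL + cqt_ms;
}
```

With kato = 5 and no CQT this gives 10000 ms without TBKAS; with TBKAS set
and cqt = 2000 it gives 17000 ms.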

> Let's cap kato to something like 10 seconds (which is 2x the default which
> apparently no one is touching).

If I understood TP4129 correctly, the upper limit is now defined, so we
don't have to define our own upper limit.