[PATCHv3 3/5] nvme: implement I/O Command Sets Command Set support
Keith Busch
kbusch at kernel.org
Wed Jun 24 13:25:09 EDT 2020
On Tue, Jun 23, 2020 at 04:17:30PM -0700, Sagi Grimberg wrote:
>
> > > > > - len = nvme_process_ns_desc(ctrl, ids, cur);
> > > > > + len = nvme_process_ns_desc(ctrl, ids, cur, &csi_seen);
> > > > > if (len < 0)
> > > > > goto free_data;
> > > > > len += sizeof(*cur);
> > > > > }
> > > > > free_data:
> > > > > + if (!status && nvme_multi_css(ctrl) && !csi_seen) {
> > > >
> > > > We will clear the status if we detect a path error, that is to
> > > > avoid needlessly removing the ns for path failures, so you should
> > > > check at the goto site.
> > >
> > > The problem is that this check has to be done after checking all the ns descs,
> > > so this check to be done as the final thing, at least after processing all the
> > > ns descs. No matter if nvme_process_ns_desc() returned an error, or if
> > > simply NVME_NIDT_CSI wasn't part of the ns desc list, so the loop reached the
> > > end without error.
> > >
> > > Even if the nvme command failed and the status was cleared:
> > >
> > > if (status > 0 && !(status & NVME_SC_DNR))
> > > status = 0;
> >
> > This check is so weird. What has DNR got to do with whether or not we
> > want to continue with this namespace? The commit that adds this says
> > it's to check for a host failed IO, but a controller can just as validly
> > set DNR in its error status, in which case we'd still want clear the
> > status.
>
> The reason is that if this error is not a DNR error, it means that we
> should retry and succeed, we don't want to remove the namespace.
And what if it is a DNR error? For example, the controller simply
doesn't support this CNS value. While the controller should support it,
we definitely don't need it for NVM command set namespaces.
> The problem that this solves is the fact that we have various error
> recovery conditions (interconnect issues, failures, resets) and the
> async works are designed to continue to run, issue I/O and fail. The
> scan work, will revalidate namespaces and remove them if it fails to
> do so, which is inherently wrong if these are simply inaccessible by
> the host at this time.
Relying on a specific bit in the status code seems a bit indirect for
this kind of condition.
More information about the Linux-nvme
mailing list