[PATCHv3 3/5] nvme: implement I/O Command Sets Command Set support

Sagi Grimberg sagi at grimberg.me
Wed Jun 24 13:46:03 EDT 2020



On 6/24/20 10:25 AM, Keith Busch wrote:
> On Tue, Jun 23, 2020 at 04:17:30PM -0700, Sagi Grimberg wrote:
>>
>>>>>> -		len = nvme_process_ns_desc(ctrl, ids, cur);
>>>>>> +		len = nvme_process_ns_desc(ctrl, ids, cur, &csi_seen);
>>>>>>     		if (len < 0)
>>>>>>     			goto free_data;
>>>>>>     		len += sizeof(*cur);
>>>>>>     	}
>>>>>>     free_data:
>>>>>> +	if (!status && nvme_multi_css(ctrl) && !csi_seen) {
>>>>>
>>>>> We will clear the status if we detect a path error, that is to
>>>>> avoid needlessly removing the ns for path failures, so you should
>>>>> check at the goto site.
>>>>
>>>> The problem is that this check has to be done after checking all the ns descs,
>>>> so this check to be done as the final thing, at least after processing all the
>>>> ns descs. No matter if nvme_process_ns_desc() returned an error, or if
>>>> simply NVME_NIDT_CSI wasn't part of the ns desc list, so the loop reached the
>>>> end without error.
>>>>
>>>> Even if the nvme command failed and the status was cleared:
>>>>
>>>>                   if (status > 0 && !(status & NVME_SC_DNR))
>>>>                           status = 0;
>>>
>>> This check is so weird. What has DNR got to do with whether or not we
>>> want to continue with this namespace? The commit that adds this says
>>> it's to check for a host failed IO, but a controller can just as validly
>>> set DNR in its error status, in which case we'd still want clear the
>>> status.
>>
>> The reason is that if this error is not a DNR error, it means that we
>> should retry and succeed, we don't want to remove the namespace.
> 
> And what if it is a DNR error? For example, the controller simply
> doesn't support this CNS value. While the controller should support it,
> we definitely don't need it for NVM command set namespaces.

Maybe I mis-undersatnd the comment, but if you see a DNR error, it means
that the controller replied an error and its final, so then you can have
extra checks.

I think the point here is that if we got a non-dnr status, we should not
take any actions on this namespace because we need to retry first
(either because the controller is unavailable or it needs the host to
retry for another reason).

>> The problem that this solves is the fact that we have various error
>> recovery conditions (interconnect issues, failures, resets) and the
>> async works are designed to continue to run, issue I/O and fail. The
>> scan work, will revalidate namespaces and remove them if it fails to
>> do so, which is inherently wrong if these are simply inaccessible by
>> the host at this time.
> 
> Relying on a specific bit in the status code seems a bit indirect for
> this kind of condition.

I actually think this approach covers exactly what we are trying to
achieve. But I'll let others comment on this.



More information about the Linux-nvme mailing list