nvme: restore use of blk_path_error() in nvme_complete_rq()
Sagi Grimberg
sagi at grimberg.me
Thu Aug 6 21:21:03 EDT 2020
Hey Mike,
>> The point is: blk_path_error() has nothing to do with NVMe errors.
>> This is dm-multipath logic stuck in the middle of the NVMe error
>> handling code.
>
> No, it is a means to have multiple subsystems (to this point both SCSI
> and NVMe) doing the correct thing when translating subsystem specific
> error codes to BLK_STS codes.
Not exactly, don't find any use of this in scsi. The purpose is to make
sure that nvme and dm speak the same language.
> If you, or others you name drop below, understood the point we wouldn't
> be having this conversation. You'd accept the point of blk_path_error()
> to be valid and required codification of what constitutes a retryable
> path error for the Linux block layer.
This incident is a case where the specific nvme status was designed
to retry on the same path respecting the controller retry delay.
And because nvme used blk_path_error at the time it caused us to use a
non-retryable status to get around that. Granted, no one had
dm-multipath in mind.
So in a sense, there is consensus on changing patch 35038bffa87da
_because_ nvme no longer uses blk_path_error. Otherwise it would break.
> Any BLK_STS mapping of NVMe specific error codes would need to not screw
> up by categorizing a retryable error as non-retryable (and vice-versa).
But it is a special type of retryable. There is nothing that fits the
semantics of the current behavior.
> Again, assuming proper testing, commit 35038bffa87 wouldn't have made it
> upstream if NVMe still used blk_path_error().
Agree.
> Yes, your commit 764e9332098c0 needlessly removed NVMe's use of
> blk_path_error(). If you're saying it wasn't needless please explain
> why.
>
>> Fixes: 764e9332098c0 ("nvme-multipath: do not reset on unknown status")
>> Cc: stable at vger.kerneel.org
>> Signed-off-by: Mike Snitzer <snitzer at redhat.com>
>> ---
>> drivers/nvme/host/core.c | 9 +++++++--
>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index 6585d57112ad..072f629da4d8 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -290,8 +290,13 @@ void nvme_complete_rq(struct request *req)
>> nvme_req(req)->ctrl->comp_seen = true;
>>
>> if (unlikely(status != BLK_STS_OK && nvme_req_needs_retry(req))) {
>> - if ((req->cmd_flags & REQ_NVME_MPATH) && nvme_failover_req(req))
>> - return;
>> + if (blk_path_error(status)) {
>> + if (req->cmd_flags & REQ_NVME_MPATH) {
>> + if (nvme_failover_req(req))
>> + return;
>> + /* fallthru to normal error handling */
>> + }
>> + }
>>
>> This would basically undo the patch Hannes, Christoph, and I put together in
>> commit 764e9332098c0. This patch greatly simplified and improved the
>> whole nvme_complete_rq() code path, and I don't support undoing that change.
>
> Please elaborate on how I've undone anything?
I think you're right, this hasn't undone the patch from John, but it
breaks NVME_SC_CMD_INTERRUPTED error handling behavior.
> The only thing I may have done is forced NVMe to take more care about
> properly translating its NVME_SC to BLK_STS in nvme_error_status().
> Which is a good thing.
I don't think there is an issue here of mistakenly converting an nvme
status code to a wrong block status code. This conversion is there
because there is no block status that give us the semantics we need
which is apparently specific to nvme.
I personally don't mind restoring blk_path_error to nvme, I don't
particularly feel strong about it either. But for sure blk_path_error
needs to first provide the semantics needed for NVME_SC_CMD_INTERRUPTED.
...
> Anyway, no new BLK_STS is needed at this point. More discipline with
> how NVMe's error handling is changed is.
Please read the above.
> If NVMe wants to ensure its
> interface isn't broken regularly it _should_ use blk_path_error() to
> validate future nvme_error_status() changes. Miscategorizing NVMe
> errors to upper layers is a bug -- not open for debate.
Again, don't agree is a Miscategorization nor a bug, its just something
that is NVMe specific.
More information about the Linux-nvme
mailing list