[PATCH v3] nvme: fix identify error status silent ignore
Sagi Grimberg
sagi at grimberg.me
Fri Jun 26 13:46:29 EDT 2020
Patch 59c7c3caaaf8 intended to only silently ignore
non retry-able errors (DNR bit set) such that we can still
identify misbehaving controllers, and on the other hand
propagate retry-able errors (DNR bit cleared) so we don't
wrongly abandon a namespace just because it happens to be
temporarily inaccessible.
The goal remains the same as in the original commit that introduced
this check, which unfortunately had the logic backwards.
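The intended status handling can be sketched in isolation as below. This
is only an illustration, not driver code: filter_identify_descs_status()
is a hypothetical helper, and the NVME_SC_DNR value is assumed to match
the definition in include/linux/nvme.h. Negative values stand for kernel
errnos, positive values for NVMe status codes.

```c
/* Do Not Retry bit; assumed to match include/linux/nvme.h. */
#define NVME_SC_DNR 0x4000

/*
 * Hypothetical helper (not part of the driver) mirroring the fixed check.
 *   status < 0           kernel errno, propagated unchanged
 *   status > 0, DNR set  non retry-able NVMe error, silently ignored
 *   status > 0, DNR clr  retry-able NVMe error, propagated to the caller
 */
static int filter_identify_descs_status(int status)
{
	if (status > 0 && (status & NVME_SC_DNR))
		return 0;
	return status;
}
```

With this sketch, a status with the DNR bit set (for example 0x4002) is
turned into 0, while a status without it (for example 0x370) and a
negative errno are returned as they are.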
Fixes: 59c7c3caaaf8 ("nvme: fix possible hang when ns scanning fails during error recovery")
Reported-by: Keith Busch <kbusch at kernel.org>
Reviewed-by: Keith Busch <kbusch at kernel.org>
Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
Changes from v2:
- added comment on non-trivial code
Changes from v1:
- remove parentheses
drivers/nvme/host/core.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 2afed32d3892..92dc2327bf3a 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1128,9 +1128,15 @@ static int nvme_identify_ns_descs(struct nvme_ctrl *ctrl, unsigned nsid,
"Identify Descriptors failed (%d)\n", status);
/*
* Don't treat an error as fatal, as we potentially already
- * have a NGUID or EUI-64.
+ * have a NGUID or EUI-64. If we failed with DNR set, we want
+ * to silently ignore the error as we can still identify
+ * the device, but if the status has DNR cleared, we want
+ * to propagate the error back specifically for the disk
+ * revalidation flow to make sure we don't abandon the
+ * device just because of a temporary retry-able error (such
+ * as path or transport errors).
*/
- if (status > 0 && !(status & NVME_SC_DNR))
+ if (status > 0 && status & NVME_SC_DNR)
status = 0;
goto free_data;
}
--
2.25.1