Should NVME_SC_INVALID_NS be translated to BLK_STS_IOERR instead of BLK_STS_NOTSUPP so that multipath(both native and dm) can failover on the failure?
Sagi Grimberg
sagi at grimberg.me
Mon Dec 4 00:47:34 PST 2023
> Hi all,
>
> I have two storage servers, each of which has an NVMe SSD. Recently I have
> been trying nvmet-tcp with DRBD. The steps are:
> 1. Configure DRBD for the two SSDs in two-primary mode, so that each
> server can accept IO on its DRBD device.
> 2. On each server, add the corresponding DRBD device to an nvmet subsystem
> with the same device uuid, so that multipath on the host side can group
> them into one device (my fabric type is tcp).
> 3. On the client host, nvme discover and connect to both servers, making
> sure a DM multipath device is generated and both paths are online.
> 4. Run fio randread on the DM device continuously.
> 5. On the server whose multipath status is active, in the nvmet namespace
> configfs directory, execute "echo 0 > enable" to disable the namespace.
> What I expect is that IO is automatically retried and switched to the
> other storage server by multipath, and fio goes on. But what I actually
> see is an "Operation not supported" error, and fio fails and stops. I've
> also tried an iSCSI target: after I delete the mapped lun from the acl,
> fio continues running without any error.
>
> My kernel version is 4.18.0-147.5.1 (rhel 8.1). After checking the
> kernel code, I found that:
> 1. On the target side, nvmet returns NVME_SC_INVALID_NS to the host
> because the namespace is not found.
> 2. On the host side, the nvme driver translates this error to
> BLK_STS_NOTSUPP for the block layer.
> 3. Multipath calls blk_path_error() to decide whether to retry.
> 4. In blk_path_error(), BLK_STS_NOTSUPP is not considered to be a path
> error, so it returns false and multipath does not retry.
> I've also checked the master branch from origin; it's almost the same.
> With the iSCSI target the process is similar; the only difference is
> that TCM_NON_EXISTENT_LUN is translated to BLK_STS_IOERR, which is
> considered to be a path error in blk_path_error().
>
> So my question is as in the subject: is it reasonable to translate
> NVME_SC_INVALID_NS to BLK_STS_IOERR, just like what the iSCSI target does?
> Should multipath fail over on this error?
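
For readers following along, the decision path described in the quoted
text can be modeled in a few lines. This is only a minimal userspace
sketch, not kernel code: the enum names and values are illustrative
stand-ins, the model_* helpers are hypothetical, and only the status
codes mentioned in this thread (plus a catch-all) are covered. It
assumes, as described above, that NVME_SC_INVALID_NS is translated to
BLK_STS_NOTSUPP and that blk_path_error() treats BLK_STS_NOTSUPP as a
non-path error.

```c
#include <stdbool.h>

/* Illustrative stand-ins: names echo the kernel's, values are arbitrary. */
enum model_nvme_sc {
	MODEL_SC_SUCCESS,
	MODEL_SC_INVALID_NS,	/* namespace not found on the target */
	MODEL_SC_OTHER		/* any status with no specific mapping */
};

enum model_blk_sts {
	MODEL_STS_OK,
	MODEL_STS_NOTSUPP,
	MODEL_STS_IOERR
};

/* Host-side translation as described in the thread (simplified). */
static enum model_blk_sts model_nvme_error_status(enum model_nvme_sc sc)
{
	switch (sc) {
	case MODEL_SC_SUCCESS:
		return MODEL_STS_OK;
	case MODEL_SC_INVALID_NS:
		return MODEL_STS_NOTSUPP;
	default:
		return MODEL_STS_IOERR;
	}
}

/* Path-error check: NOTSUPP is treated as a target-side failure. */
static bool model_blk_path_error(enum model_blk_sts sts)
{
	switch (sts) {
	case MODEL_STS_NOTSUPP:
		return false;
	default:
		/* Anything else could be a path failure. */
		return true;
	}
}

/* Would multipath retry a failed request on another path? */
static bool model_multipath_would_retry(enum model_nvme_sc sc)
{
	enum model_blk_sts sts = model_nvme_error_status(sc);

	return sts != MODEL_STS_OK && model_blk_path_error(sts);
}
```

In this model, model_multipath_would_retry(MODEL_SC_INVALID_NS) is
false while an unmapped status that falls through to the IOERR default
is retried, which matches the behavior reported above.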
The host issued IO to a non-existing namespace. Semantically it is not
an IO error in the sense that it is retryable.

Btw, AFAICT TCM_NON_EXISTENT_LUN does return an ILLEGAL_REQUEST; however,
the host chooses to ignore that particular additional sense.

While I guess similar behavior could be done in nvme, the question is
why is a non-existent namespace failure a retryable error? The namespace
is gone...

Thoughts?

Perhaps what you are seeking is a soft way to disable a namespace based
on your test case?