Should NVME_SC_INVALID_NS be translated to BLK_STS_IOERR instead of BLK_STS_NOTSUPP so that multipath (both native and dm) can fail over on the failure?

Sagi Grimberg sagi at grimberg.me
Mon Jan 1 01:51:55 PST 2024


> I've tested the patch on top of kernel version 6.6.0. It does not seem
> to work...

Can you paste the log output (host and controller)?

> 
> here's my steps & results:
> 
> 1. create a VM, then build and install the kernel from source with the patch applied.
> 
> [root@fjr-nvmet-1 ~]# uname -r
> 6.6.0-mytest+
> 
> 2. clone that VM.
> 
> 3. create a shared volume and attach it to both VMs.
> 
> 4. configure nvmet as below:
> 
> VM1:
> 
> o- / ......................................................................................................................... [...]
>    o- hosts ................................................................................................................... [...]
>    o- ports ................................................................................................................... [...]
>    | o- 1 ................................................ [trtype=tcp, traddr=192.168.111.99, trsvcid=4420, inline_data_size=262144]
>    |   o- ana_groups .......................................................................................................... [...]
>    |   | o- 1 ..................................................................................................... [state=optimized]
>    |   o- referrals ........................................................................................................... [...]
>    |   o- subsystems .......................................................................................................... [...]
>    |     o- nqn.2014-08.org.nvmexpress:NVMf:uuid:cf4bb93c-949f-4532-a5c1-b8bd267a4e06 ......................................... [...]
>    o- subsystems .............................................................................................................. [...]
>      o- nqn.2014-08.org.nvmexpress:NVMf:uuid:cf4bb93c-949f-4532-a5c1-b8bd267a4e06 [version=1.3, allow_any=1, serial=308df2776344fdd17cba]
>        o- allowed_hosts ....................................................................................................... [...]
>        o- namespaces .......................................................................................................... [...]
>          o- 1 .......................................... [path=/dev/vdc, uuid=cf4bb93c-949f-4532-a5c1-b8bd267a4e06, grpid=1, enabled]
> 
> VM2:
> 
> o- / ......................................................................................................................... [...]
>    o- hosts ................................................................................................................... [...]
>    o- ports ................................................................................................................... [...]
>    | o- 1 ............................................... [trtype=tcp, traddr=192.168.111.111, trsvcid=4420, inline_data_size=262144]
>    |   o- ana_groups .......................................................................................................... [...]
>    |   | o- 1 ..................................................................................................... [state=optimized]
>    |   o- referrals ........................................................................................................... [...]
>    |   o- subsystems .......................................................................................................... [...]
>    |     o- nqn.2014-08.org.nvmexpress:NVMf:uuid:cf4bb93c-949f-4532-a5c1-b8bd267a4e06 ......................................... [...]
>    o- subsystems .............................................................................................................. [...]
>      o- nqn.2014-08.org.nvmexpress:NVMf:uuid:cf4bb93c-949f-4532-a5c1-b8bd267a4e06 [version=1.3, allow_any=1, serial=0dcf77d36826000cc5a0]
>        o- allowed_hosts ....................................................................................................... [...]
>        o- namespaces .......................................................................................................... [...]
>          o- 1 .......................................... [path=/dev/vdc, uuid=cf4bb93c-949f-4532-a5c1-b8bd267a4e06, grpid=1, enabled]
> 
> 5. create a host VM (CentOS 8.1, kernel version
> 4.18.0-147.3.1.el8_1.aarch64) and configure dm multipath
> 
> [root@fjr-vm1 ~]# cat /etc/multipath/conf.d/nvme.conf
> devices {
>      device {
>                  vendor                  "NVME"
>                  product                 "Linux"
>                  path_selector           "round-robin 0"
>                  path_grouping_policy    failover
>                  uid_attribute           ID_SERIAL
>                  prio                    "ANA"
>                  path_checker            "none"
>                  #rr_min_io               100
>                  #rr_min_io_rq            "1"
>                  #fast_io_fail_tmo        15
>                  #dev_loss_tmo            600
>                  #rr_weight               uniform
>                  rr_weight               priorities
>                  failback                immediate
>                  no_path_retry           queue
>      }
> }
> 
> 
> 6. connect nvme on the host; finally it looks like:
> 
> [root@fjr-vm1 ~]# nvme list
> Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
> ---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
> /dev/nvme0n1     0dcf77d36826000cc5a0 Linux                                    1         107.37  GB / 107.37  GB    512   B +  0 B   6.6.0-my
> /dev/nvme1n1     308df2776344fdd17cba Linux                                    1         107.37  GB / 107.37  GB    512   B +  0 B   6.6.0-my
> 
> [root@fjr-vm1 ~]# nvme list-subsys
> nvme-subsys0 - NQN=nqn.2014-08.org.nvmexpress:NVMf:uuid:cf4bb93c-949f-4532-a5c1-b8bd267a4e06
> \
>   +- nvme0 tcp traddr=192.168.111.111 trsvcid=4420 live
>   +- nvme1 tcp traddr=192.168.111.99 trsvcid=4420 live
> 
> [root@fjr-vm1 ~]# multipath -ll
> mpatha (Linux_0dcf77d36826000cc5a0) dm-0 NVME,Linux
> size=100G features='1 queue_if_no_path' hwhandler='0' wp=rw
> |-+- policy='round-robin 0' prio=50 status=active
> | `- 1:1:1:1 nvme1n1 259:1 active ready running
> `-+- policy='round-robin 0' prio=50 status=enabled
>    `- 0:1:1:1 nvme0n1 259:0 active ready running
> 
> 7. execute fio on the host and disable the namespace on the VM
> corresponding to nvme1; the same error occurs again:
> 
> fio: io_u error on file /dev/dm-0: Operation not supported: write offset=14734594048, buflen=4096
> fio: io_u error on file /dev/dm-0: Operation not supported: write offset=106607394816, buflen=4096
> fio: pid=16076, err=95/file:io_u.c:1747, func=io_u error, error=Operation not supported
> 
> 
> 
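
For reference, the err=95 in the fio output above is EOPNOTSUPP on Linux,
which is the errno that BLK_STS_NOTSUPP converts to. Below is a minimal
userspace sketch of the behavior under discussion, not the kernel code
itself. The enum names and values, and the helpers translate(),
is_path_error() and to_errno(), are hypothetical stand-ins for the host-side
status translation, blk_path_error() and the block-status-to-errno
conversion. The invalid_ns_as_ioerr flag models the proposed change and is
an assumption made only for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative stand-ins; the values are NOT the kernel's definitions. */
enum nvme_status { SC_SUCCESS, SC_INVALID_OPCODE, SC_INVALID_FIELD, SC_INVALID_NS, SC_OTHER };
enum blk_status { STS_OK, STS_NOTSUPP, STS_IOERR };

/*
 * Model of the NVMe-status to block-status translation.
 * With invalid_ns_as_ioerr == false, Invalid Namespace is grouped with the
 * other "invalid command" statuses and becomes STS_NOTSUPP.
 * With invalid_ns_as_ioerr == true (the proposal in this thread), it becomes
 * STS_IOERR instead.
 */
enum blk_status translate(enum nvme_status sc, bool invalid_ns_as_ioerr)
{
	switch (sc) {
	case SC_SUCCESS:
		return STS_OK;
	case SC_INVALID_NS:
		return invalid_ns_as_ioerr ? STS_IOERR : STS_NOTSUPP;
	case SC_INVALID_OPCODE:
	case SC_INVALID_FIELD:
		return STS_NOTSUPP;
	default:
		return STS_IOERR;
	}
}

/*
 * Model of the path-error check a multipath layer applies: "not supported"
 * is treated as a target-side error that retrying on another path would not
 * fix, so it does not trigger failover. A generic I/O error does.
 */
bool is_path_error(enum blk_status sts)
{
	switch (sts) {
	case STS_OK:
	case STS_NOTSUPP:
		return false;
	default:
		return true;
	}
}

/* Model of the errno that finally reaches the application (fio here). */
int to_errno(enum blk_status sts)
{
	switch (sts) {
	case STS_OK:
		return 0;
	case STS_NOTSUPP:
		return EOPNOTSUPP;
	default:
		return EIO;
	}
}
```

In this model, translate(SC_INVALID_NS, false) yields a status that
is_path_error() rejects, so the error goes straight to the application as
EOPNOTSUPP. With the flag set to true, the same status counts as a path
error and the multipath layer gets a chance to retry on the other path.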