Should NVME_SC_INVALID_NS be translated to BLK_STS_IOERR instead of BLK_STS_NOTSUPP so that multipath (both native and dm) can fail over on the failure?
Sagi Grimberg
sagi at grimberg.me
Mon Jan 1 01:51:55 PST 2024
> I've tested the patch based on kernel version 6.6.0. It does not seem to
> work...
Can you paste the log output (host and controller)?
>
> here's my steps & results:
>
> 1. create a VM, then build and install the kernel from source with the patch applied.
>
> [root@fjr-nvmet-1 ~]# uname -r
> 6.6.0-mytest+
>
> 2. clone that VM.
>
> 3. create a shared volume and attach it to both VMs.
>
> 4. configure nvmet as below:
>
> VM1:
>
> o- / ......................................................................................................................... [...]
>   o- hosts ................................................................................................................... [...]
>   o- ports ................................................................................................................... [...]
>   | o- 1 ................................................ [trtype=tcp, traddr=192.168.111.99, trsvcid=4420, inline_data_size=262144]
>   |   o- ana_groups .......................................................................................................... [...]
>   |   | o- 1 ..................................................................................................... [state=optimized]
>   |   o- referrals ........................................................................................................... [...]
>   |   o- subsystems .......................................................................................................... [...]
>   |     o- nqn.2014-08.org.nvmexpress:NVMf:uuid:cf4bb93c-949f-4532-a5c1-b8bd267a4e06 ......................................... [...]
>   o- subsystems .............................................................................................................. [...]
>     o- nqn.2014-08.org.nvmexpress:NVMf:uuid:cf4bb93c-949f-4532-a5c1-b8bd267a4e06 [version=1.3, allow_any=1, serial=308df2776344fdd17cba]
>       o- allowed_hosts ....................................................................................................... [...]
>       o- namespaces .......................................................................................................... [...]
>         o- 1 .......................................... [path=/dev/vdc, uuid=cf4bb93c-949f-4532-a5c1-b8bd267a4e06, grpid=1, enabled]
>
> VM2:
>
> o- / ......................................................................................................................... [...]
>   o- hosts ................................................................................................................... [...]
>   o- ports ................................................................................................................... [...]
>   | o- 1 ............................................... [trtype=tcp, traddr=192.168.111.111, trsvcid=4420, inline_data_size=262144]
>   |   o- ana_groups .......................................................................................................... [...]
>   |   | o- 1 ..................................................................................................... [state=optimized]
>   |   o- referrals ........................................................................................................... [...]
>   |   o- subsystems .......................................................................................................... [...]
>   |     o- nqn.2014-08.org.nvmexpress:NVMf:uuid:cf4bb93c-949f-4532-a5c1-b8bd267a4e06 ......................................... [...]
>   o- subsystems .............................................................................................................. [...]
>     o- nqn.2014-08.org.nvmexpress:NVMf:uuid:cf4bb93c-949f-4532-a5c1-b8bd267a4e06 [version=1.3, allow_any=1, serial=0dcf77d36826000cc5a0]
>       o- allowed_hosts ....................................................................................................... [...]
>       o- namespaces .......................................................................................................... [...]
>         o- 1 .......................................... [path=/dev/vdc, uuid=cf4bb93c-949f-4532-a5c1-b8bd267a4e06, grpid=1, enabled]
>
> 5. create a host VM (CentOS 8.1, kernel version
> 4.18.0-147.3.1.el8_1.aarch64) and configure dm multipath
>
> [root@fjr-vm1 ~]# cat /etc/multipath/conf.d/nvme.conf
> devices {
>     device {
>         vendor "NVME"
>         product "Linux"
>         path_selector "round-robin 0"
>         path_grouping_policy failover
>         uid_attribute ID_SERIAL
>         prio "ANA"
>         path_checker "none"
>         #rr_min_io 100
>         #rr_min_io_rq "1"
>         #fast_io_fail_tmo 15
>         #dev_loss_tmo 600
>         #rr_weight uniform
>         rr_weight priorities
>         failback immediate
>         no_path_retry queue
>     }
> }
>
>
> 6. connect nvme on host, finally it looks like:
>
> [root@fjr-vm1 ~]# nvme list
> Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
> ---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
> /dev/nvme0n1     0dcf77d36826000cc5a0 Linux                                    1         107.37 GB / 107.37 GB      512 B + 0 B      6.6.0-my
> /dev/nvme1n1     308df2776344fdd17cba Linux                                    1         107.37 GB / 107.37 GB      512 B + 0 B      6.6.0-my
>
> [root@fjr-vm1 ~]# nvme list-subsys
> nvme-subsys0 - NQN=nqn.2014-08.org.nvmexpress:NVMf:uuid:cf4bb93c-949f-4532-a5c1-b8bd267a4e06
> \
>  +- nvme0 tcp traddr=192.168.111.111 trsvcid=4420 live
>  +- nvme1 tcp traddr=192.168.111.99 trsvcid=4420 live
>
> [root@fjr-vm1 ~]# multipath -ll
> mpatha (Linux_0dcf77d36826000cc5a0) dm-0 NVME,Linux
> size=100G features='1 queue_if_no_path' hwhandler='0' wp=rw
> |-+- policy='round-robin 0' prio=50 status=active
> | `- 1:1:1:1 nvme1n1 259:1 active ready running
> `-+- policy='round-robin 0' prio=50 status=enabled
> `- 0:1:1:1 nvme0n1 259:0 active ready running
>
> 7. run fio on the host and disable the namespace on the VM corresponding to
> nvme1; the same error occurs again:
>
> fio: io_u error on file /dev/dm-0: Operation not supported: write offset=14734594048, buflen=4096
> fio: io_u error on file /dev/dm-0: Operation not supported: write offset=106607394816, buflen=4096
> fio: pid=16076, err=95/file:io_u.c:1747, func=io_u error, error=Operation not supported
>
>
>
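For readers following along, here is a rough userspace sketch of why the
translation in the subject line matters for failover. It is only a model,
not kernel code: the status names mirror kernel identifiers, the non-path
list is an abbreviated version of what blk_path_error() checks as far as I
recall, and the helper names (is_path_error, outcome) are made up for
illustration.

```python
import errno

# Abbreviated set of block statuses treated as target-side failures,
# i.e. statuses for which retrying on another path is assumed not to help.
# (Assumption: modeled on blk_path_error(); the real list has more entries.)
NON_PATH_ERRORS = {
    "BLK_STS_NOTSUPP",
    "BLK_STS_NOSPC",
    "BLK_STS_TARGET",
    "BLK_STS_MEDIUM",
    "BLK_STS_PROTECTION",
}

# errno surfaced to userspace for the two statuses discussed in this thread.
BLK_STS_TO_ERRNO = {
    "BLK_STS_NOTSUPP": errno.EOPNOTSUPP,
    "BLK_STS_IOERR": errno.EIO,
}

# The two candidate translations of the NVMe status (hypothetical tables).
CURRENT_TRANSLATION = {"NVME_SC_INVALID_NS": "BLK_STS_NOTSUPP"}
PROPOSED_TRANSLATION = {"NVME_SC_INVALID_NS": "BLK_STS_IOERR"}


def is_path_error(blk_status):
    """Return True if the status should trigger a retry on another path."""
    return blk_status not in NON_PATH_ERRORS


def outcome(nvme_status, translation):
    """Describe what a multipath layer would do with a failed request."""
    blk_status = translation[nvme_status]
    if is_path_error(blk_status):
        return "failover"
    return f"fail I/O with errno {BLK_STS_TO_ERRNO[blk_status]}"


if __name__ == "__main__":
    print("current: ", outcome("NVME_SC_INVALID_NS", CURRENT_TRANSLATION))
    print("proposed:", outcome("NVME_SC_INVALID_NS", PROPOSED_TRANSLATION))
```

In this model the current translation ends the I/O with EOPNOTSUPP, which
is consistent with the "Operation not supported" (err=95 on Linux) that fio
reports above, while the proposed translation would let the request be
retried on the other path.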