NetApp SnapMirror active sync NVMe target and Linux initiator: Paths disappearing after a power failure of one NetApp AFF A150 system

Thomas Glanzmann thomas at glanzmann.de
Thu Sep 18 09:09:52 PDT 2025


Hello,
today I tried out two NetApp AFF A150 systems, each consisting of two
controllers and 8 disks, with NVMe/TCP. The idea is that you have 2
paths to each controller via two VLANs. If a controller or an entire A150
fails, the other one should take over. That worked; however, when the manually
power-fenced system came back, I saw that the paths to it were 'changing' and
then disappearing. I could, however, reconnect the paths manually.
I assume that the NVMe target of the power-fenced system came up without the
full configuration loaded. As a result, the Linux NVMe initiator dropped the
paths. Is that correct, or is it a bug in the Linux NVMe stack? I used Debian
trixie (6.12.43+deb13-cloud-amd64).

- Before the failure I had 8 paths:

(debian-05) [~] nvme list-subsys /dev/nvme4n1
nvme-subsys4 - NQN=nqn.1992-08.com.netapp:sn.347a30cc947511f083a6d039ea2800fc:subsystem.subsystem
               hostnqn=nqn.2014-08.org.nvmexpress:uuid:a44b1e42-29ec-1c09-c043-741282002d9c
\
 +- nvme10 tcp traddr=10.0.10.238,trsvcid=4420,src_addr=10.0.10.25 live non-optimized
 +- nvme11 tcp traddr=10.0.20.238,trsvcid=4420,src_addr=10.0.20.25 live non-optimized
 +- nvme4 tcp traddr=10.0.10.235,trsvcid=4420,src_addr=10.0.10.25 live optimized
 +- nvme5 tcp traddr=10.0.20.235,trsvcid=4420,src_addr=10.0.20.25 live optimized
 +- nvme6 tcp traddr=10.0.10.236,trsvcid=4420,src_addr=10.0.10.25 live non-optimized
 +- nvme7 tcp traddr=10.0.20.236,trsvcid=4420,src_addr=10.0.20.25 live non-optimized
 +- nvme8 tcp traddr=10.0.10.237,trsvcid=4420,src_addr=10.0.10.25 live non-optimized
 +- nvme9 tcp traddr=10.0.20.237,trsvcid=4420,src_addr=10.0.20.25 live non-optimized

(debian-05) [~] mount /dev/nvme4n1 /mnt
(debian-05) [~] cd /mnt/
(debian-05) [/mnt] while true; do date | tee -a date.log; sync; sleep 1; done
...
Thu Sep 18 16:04:49 CEST 2025
Thu Sep 18 16:04:50 CEST 2025
# 10 seconds for the path failover
Thu Sep 18 16:05:00 CEST 2025
Thu Sep 18 16:05:01 CEST 2025
...

- While the first AFF A150 was offline:

(debian-05) [/mnt] nvme list-subsys /dev/nvme4n1
nvme-subsys4 - NQN=nqn.1992-08.com.netapp:sn.347a30cc947511f083a6d039ea2800fc:subsystem.subsystem
               hostnqn=nqn.2014-08.org.nvmexpress:uuid:a44b1e42-29ec-1c09-c043-741282002d9c
\
 +- nvme10 tcp traddr=10.0.10.238,trsvcid=4420,src_addr=10.0.10.25 live non-optimized
 +- nvme11 tcp traddr=10.0.20.238,trsvcid=4420,src_addr=10.0.20.25 live non-optimized
 +- nvme4 tcp traddr=10.0.10.235,trsvcid=4420 connecting optimized
 +- nvme5 tcp traddr=10.0.20.235,trsvcid=4420 connecting optimized
 +- nvme6 tcp traddr=10.0.10.236,trsvcid=4420 connecting non-optimized
 +- nvme7 tcp traddr=10.0.20.236,trsvcid=4420 connecting non-optimized
 +- nvme8 tcp traddr=10.0.10.237,trsvcid=4420,src_addr=10.0.10.25 live non-optimized
 +- nvme9 tcp traddr=10.0.20.237,trsvcid=4420,src_addr=10.0.20.25 live non-optimized

- Then it said 'changing' for the failed paths. I don't have the output because I ran it in watch.
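
One way to keep a record of such transient states, instead of losing them in
watch, would be to append timestamped output to a log file. This is only a
rough sketch: the function names log_paths and live_paths and the log file
name are made up for illustration, and the grep pattern assumes the
list-subsys output format shown in this mail.

```shell
# Count paths in state "live" in "nvme list-subsys" output read from stdin.
# Assumption: path lines look like " +- nvmeN tcp ... live <ana-state>".
live_paths() {
    grep -c '^ +- nvme.* live '
}

# Append a timestamp and the full path listing to a log file once per second.
# $1 = namespace device, $2 = log file (both supplied by the caller).
log_paths() {
    while true; do
        date
        nvme list-subsys "$1"
        sleep 1
    done | tee -a "$2"
}

# Example (needs root and nvme-cli):
#   log_paths /dev/nvme4n1 paths.log
# Afterwards, a saved listing can be summarized with:
#   live_paths < one-listing.txt
```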

- And then the paths disappeared completely and did not come back.

(debian-05) [/mnt] nvme list-subsys /dev/nvme4n1
nvme-subsys4 - NQN=nqn.1992-08.com.netapp:sn.347a30cc947511f083a6d039ea2800fc:subsystem.subsystem
               hostnqn=nqn.2014-08.org.nvmexpress:uuid:a44b1e42-29ec-1c09-c043-741282002d9c
\
 +- nvme10 tcp traddr=10.0.10.238,trsvcid=4420,src_addr=10.0.10.25 live non-optimized
 +- nvme11 tcp traddr=10.0.20.238,trsvcid=4420,src_addr=10.0.20.25 live non-optimized
 +- nvme8 tcp traddr=10.0.10.237,trsvcid=4420,src_addr=10.0.10.25 live non-optimized
 +- nvme9 tcp traddr=10.0.20.237,trsvcid=4420,src_addr=10.0.20.25 live non-optimized

- Then I manually reconnected:

(debian-05) [/mnt] nvme connect -t tcp -a 10.0.10.235 -s 4420 -n nqn.1992-08.com.netapp:sn.347a30cc947511f083a6d039ea2800fc:subsystem.subsystem -i 14
(debian-05) [/mnt] nvme connect -t tcp -a 10.0.20.235 -s 4420 -n nqn.1992-08.com.netapp:sn.347a30cc947511f083a6d039ea2800fc:subsystem.subsystem -i 14
(debian-05) [/mnt] nvme connect -t tcp -a 10.0.10.236 -s 4420 -n nqn.1992-08.com.netapp:sn.347a30cc947511f083a6d039ea2800fc:subsystem.subsystem -i 14
(debian-05) [/mnt] nvme connect -t tcp -a 10.0.20.236 -s 4420 -n nqn.1992-08.com.netapp:sn.347a30cc947511f083a6d039ea2800fc:subsystem.subsystem -i 14
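
The four commands above differ only in the target address, so they could also
be generated in a loop. A minimal sketch, using only the options and addresses
from the commands above; the function name reconnect_cmds is hypothetical, and
by default it only prints the commands rather than running them.

```shell
# NQN and addresses copied from the manual "nvme connect" commands above.
NQN="nqn.1992-08.com.netapp:sn.347a30cc947511f083a6d039ea2800fc:subsystem.subsystem"

# Print one "nvme connect" command per dropped path (dry run).
reconnect_cmds() {
    for addr in 10.0.10.235 10.0.20.235 10.0.10.236 10.0.20.236; do
        echo nvme connect -t tcp -a "$addr" -s 4420 -n "$NQN" -i 14
    done
}

reconnect_cmds
# To actually reconnect (needs root), pipe the output to a shell:
#   reconnect_cmds | sh
```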

(debian-05) [/mnt] nvme list-subsys /dev/nvme4n1
nvme-subsys4 - NQN=nqn.1992-08.com.netapp:sn.347a30cc947511f083a6d039ea2800fc:subsystem.subsystem
               hostnqn=nqn.2014-08.org.nvmexpress:uuid:a44b1e42-29ec-1c09-c043-741282002d9c
\
 +- nvme10 tcp traddr=10.0.10.238,trsvcid=4420,src_addr=10.0.10.25 live non-optimized
 +- nvme11 tcp traddr=10.0.20.238,trsvcid=4420,src_addr=10.0.20.25 live non-optimized
 +- nvme12 tcp traddr=10.0.20.236,trsvcid=4420,src_addr=10.0.20.25 live non-optimized
 +- nvme5 tcp traddr=10.0.10.235,trsvcid=4420,src_addr=10.0.10.25 live optimized
 +- nvme6 tcp traddr=10.0.20.235,trsvcid=4420,src_addr=10.0.20.25 live optimized
 +- nvme7 tcp traddr=10.0.10.236,trsvcid=4420,src_addr=10.0.10.25 live non-optimized
 +- nvme8 tcp traddr=10.0.10.237,trsvcid=4420,src_addr=10.0.10.25 live non-optimized
 +- nvme9 tcp traddr=10.0.20.237,trsvcid=4420,src_addr=10.0.20.25 live non-optimized

In dmesg I saw:

[90303.743154] nvme nvme6: Reconnecting in 10 seconds...
[90311.695617] nvme nvme4: Connect Invalid Data Parameter, subsysnqn "nqn.1992-08.com.netapp:sn.347a30cc947511f083a6d039ea2800fc:subsystem.subsystem"
[90311.695714] nvme nvme4: failed to connect queue: 0 ret=16770
[90311.695754] nvme nvme4: Failed reconnect attempt 1/-1
[90311.695775] nvme nvme5: Connect Invalid Data Parameter, subsysnqn "nqn.1992-08.com.netapp:sn.347a30cc947511f083a6d039ea2800fc:subsystem.subsystem"
[90311.695789] nvme nvme4: Removing controller (16770)...
[90311.695876] nvme nvme5: failed to connect queue: 0 ret=16770
[90311.695909] nvme nvme4: Removing ctrl: NQN "nqn.1992-08.com.netapp:sn.347a30cc947511f083a6d039ea2800fc:subsystem.subsystem"

full dmesg: https://tg.st/u/297bb062eca5c9d05b533691ce98c2fffc31deafe45fe399d7492381b47313bb.txt
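
The ret=16770 value in the excerpt can be taken apart with shell arithmetic.
This is a sketch that assumes the status layout in the kernel's
include/linux/nvme.h, where 0x4000 is the Do Not Retry (DNR) bit and the low
11 bits hold the status code type and status code; the function name
decode_nvme_status is made up for illustration.

```shell
# Split a Linux NVMe "ret=" status value into its parts.
# Assumption: 0x4000 = Do Not Retry, 0x7ff = status code type + status code.
decode_nvme_status() {
    ret=$1
    sc=$(( ret & 0x7ff ))
    dnr=$(( (ret & 0x4000) != 0 ))
    printf 'ret=%d hex=0x%x sc=0x%x dnr=%d\n' "$ret" "$ret" "$sc" "$dnr"
}

decode_nvme_status 16770
# prints: ret=16770 hex=0x4182 sc=0x182 dnr=1
```

If that reading is right, 0x182 corresponds to the "Connect Invalid Data
Parameter" message in the log, and the DNR bit being set would be consistent
with the initiator removing the controllers after the first failed attempt
instead of continuing to retry.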

Cheers,
	Thomas


