nvme: subsystem handle not released on hot removal under raid1 configuration and multipath enabled
Rajashekar, Revanth
revanth.rajashekar at intel.com
Mon Nov 16 19:06:54 EST 2020
Hi all,
I'm running a few tests on multipath-capable NVMe devices in a RAID-1 configuration.
When I hot remove a device via sysfs, only the per-path block device (nvme0c0n1) is removed, while the namespace head (nvme0n1) and its subsystem handle remain intact.
Before hot removal:
===================
# nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     PHKD9325000C375AGN   INTEL SSDPD21K375GA                      1         375.08 GB / 375.08 GB      512 B + 0 B      E2010453
/dev/nvme1n1     PHKD9325000N1P5CGN   INTEL SSDPD21K015TA                      1         1.50 TB / 1.50 TB          512 B + 0 B      E2010453
# ll /sys/block
md127 -> ../devices/virtual/block/md127
nvme0c0n1 -> ../devices/pci0000:64/0000:64:00.0/0000:65:00.0/0000:66:01.0/0000:67:00.0/nvme/nvme0/nvme0c0n1
nvme0n1 -> ../devices/virtual/nvme-subsystem/nvme-subsys0/nvme0n1
nvme1c1n1 -> ../devices/pci0000:64/0000:64:00.0/0000:65:00.0/0000:66:02.0/0000:68:00.0/nvme/nvme1/nvme1c1n1
nvme1n1 -> ../devices/virtual/nvme-subsystem/nvme-subsys1/nvme1n1
# lsblk
NAME      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
nvme0n1   259:1    0 349.3G  0 disk
└─md127     9:127  0 349.2G  0 raid1
nvme1n1   259:3    0   1.4T  0 disk
└─md127     9:127  0 349.2G  0 raid1
Hot removal:
============
# echo 1 > /sys/bus/pci/devices/0000\:67\:00.0/remove
After hot removal:
===================
# nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1                                                                   -1        0.00 B / 0.00 B            1 B + 0 B
/dev/nvme1n1     PHKD9325000N1P5CGN   INTEL SSDPD21K015TA                      1         1.50 TB / 1.50 TB          512 B + 0 B      E2010453
# ll /sys/block
md127 -> ../devices/virtual/block/md127
nvme0n1 -> ../devices/virtual/nvme-subsystem/nvme-subsys0/nvme0n1
nvme1c1n1 -> ../devices/pci0000:64/0000:64:00.0/0000:65:00.0/0000:66:02.0/0000:68:00.0/nvme/nvme1/nvme1c1n1
nvme1n1 -> ../devices/virtual/nvme-subsystem/nvme-subsys1/nvme1n1
# lsblk
NAME      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
nvme0n1   259:1    0 349.3G  0 disk
└─md127     9:127  0 349.2G  0 raid1
nvme1n1   259:3    0   1.4T  0 disk
└─md127     9:127  0 349.2G  0 raid1
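The leftover state shown above (a namespace head with no per-path device behind it) can be spotted with a small shell helper. This is only a sketch: the function name stale_heads is hypothetical, and it assumes the multipath naming seen in this report, where a head is called nvmeXnY and its paths nvmeXcZnY. With multipath off there are no per-path names, so the check does not apply.

```shell
# Print NVMe namespace heads (nvmeXnY) that have no remaining per-path
# device (nvmeXcZnY) under SYSFS_ROOT/block.
# Usage: stale_heads [SYSFS_ROOT]      (SYSFS_ROOT defaults to /sys)
stale_heads() {
    root=${1:-/sys}
    for head in "$root"/block/nvme*n*; do
        [ -e "$head" ] || continue          # glob matched nothing
        name=${head##*/}
        case $name in
            nvme*c*n*) continue ;;          # per-path device, not a head
        esac
        rest=${name#nvme}                   # e.g. "0n1"
        inst=${rest%%n*}                    # subsystem instance, e.g. "0"
        ns=${rest#*n}                       # namespace index, e.g. "1"
        found=0
        for path in "$root/block/nvme${inst}c"*"n${ns}"; do
            if [ -e "$path" ]; then
                found=1
            fi
        done
        if [ "$found" -eq 0 ]; then
            printf '%s\n' "$name"
        fi
    done
    return 0
}
```

Run as "stale_heads" after the hot removal; given the /sys/block listing above it would be expected to report nvme0n1.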
I guess this is because the md-raid1 driver is still holding a reference to the nvme device (the kref count is 2 in the nvme removal path).
Is this a bug in nvme multipath? The same test works fine with multipath turned off.
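One way to check the guess about md from user space is to look at the holders directory that sysfs exposes for each block device. The helper below is a minimal sketch; the name list_holders and the optional sysfs-root argument are hypothetical and exist only to make it easy to try against a test tree. It shows which devices have claimed the disk, not the kref count itself.

```shell
# Print the names of block devices that currently hold DEV (e.g. an md array).
# Usage: list_holders DEV [SYSFS_ROOT]   (SYSFS_ROOT defaults to /sys)
list_holders() {
    dev=$1
    root=${2:-/sys}
    dir="$root/block/$dev/holders"
    [ -d "$dir" ] || return 1               # device or holders dir is gone
    for h in "$dir"/*; do
        # An empty directory leaves the glob unexpanded; skip that case.
        [ -e "$h" ] || [ -L "$h" ] || continue
        printf '%s\n' "${h##*/}"
    done
    return 0
}
```

If "list_holders nvme0n1" still prints md127 after the hot removal, the array has not released the device. Marking the member failed and removing it from the array (mdadm /dev/md127 --fail /dev/nvme0n1 --remove /dev/nvme0n1) may then be enough to let the head go away, though that has not been verified here.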
Please let me know if more information is required :)
Thank you,
Revanth