nvme: subsystem handle not released on hot removal under raid1 configuration and multipath enabled

Rajashekar, Revanth revanth.rajashekar at intel.com
Mon Nov 16 19:06:54 EST 2020


Hi all,
I'm running a few tests on multipath-capable nvme devices in a raid-1 configuration.
When I hot-remove a device via sysfs, only the per-controller block device (nvme0c0n1) is removed, while the subsystem handle (nvme0n1) remains intact.

Before hot removal:
===================
# nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     PHKD9325000C375AGN   INTEL SSDPD21K375GA                      1         375.08  GB / 375.08  GB    512   B +  0 B   E2010453
/dev/nvme1n1     PHKD9325000N1P5CGN   INTEL SSDPD21K015TA                      1           1.50  TB /   1.50  TB    512   B +  0 B   E2010453

# ll /sys/block
md127 -> ../devices/virtual/block/md127
nvme0c0n1 -> ../devices/pci0000:64/0000:64:00.0/0000:65:00.0/0000:66:01.0/0000:67:00.0/nvme/nvme0/nvme0c0n1
nvme0n1 -> ../devices/virtual/nvme-subsystem/nvme-subsys0/nvme0n1
nvme1c1n1 -> ../devices/pci0000:64/0000:64:00.0/0000:65:00.0/0000:66:02.0/0000:68:00.0/nvme/nvme1/nvme1c1n1
nvme1n1 -> ../devices/virtual/nvme-subsystem/nvme-subsys1/nvme1n1

# lsblk
NAME          MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
nvme0n1       259:1    0 349.3G  0 disk
└─md127         9:127  0 349.2G  0 raid1
nvme1n1       259:3    0   1.4T  0 disk
└─md127         9:127  0 349.2G  0 raid1

Hot removal:
============
# echo 1 > /sys/bus/pci/devices/0000\:67\:00.0/remove

After hot removal:
===================
# nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1                                                                   -1          0.00   B /   0.00   B      1   B +  0 B
/dev/nvme1n1     PHKD9325000N1P5CGN   INTEL SSDPD21K015TA                      1           1.50  TB /   1.50  TB    512   B +  0 B   E2010453

# ll /sys/block
md127 -> ../devices/virtual/block/md127
nvme0n1 -> ../devices/virtual/nvme-subsystem/nvme-subsys0/nvme0n1
nvme1c1n1 -> ../devices/pci0000:64/0000:64:00.0/0000:65:00.0/0000:66:02.0/0000:68:00.0/nvme/nvme1/nvme1c1n1
nvme1n1 -> ../devices/virtual/nvme-subsystem/nvme-subsys1/nvme1n1

# lsblk
NAME          MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
nvme0n1       259:1    0 349.3G  0 disk
└─md127         9:127  0 349.2G  0 raid1
nvme1n1       259:3    0   1.4T  0 disk
└─md127         9:127  0 349.2G  0 raid1

I guess this is because the md-raid1 driver is still holding a reference to the nvme driver (kref count is 2 in the nvme removal path).
Is this a bug in nvme multipath? I ask because the same test works fine with multipath off.
Please let me know if more information is required :)
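
One way to check the guess above from userspace is to look at the holders directory of the leftover node. This is only a minimal sketch: it assumes the usual /sys/block/<dev>/holders layout, and list_holders with its optional sysfs-root argument is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# list_holders DEV [SYSFS_ROOT]
# Print the name of every device listed under DEV's holders directory.
# SYSFS_ROOT is a hypothetical convenience argument (default: /sys) so the
# helper can be pointed at a copy of the tree.
list_holders() {
    dev=$1
    root=${2:-/sys}
    dir="$root/block/$dev/holders"

    # No such block device (or no holders directory): report failure.
    [ -d "$dir" ] || return 1

    for h in "$dir"/*; do
        # Skip the literal pattern left behind when the directory is empty.
        [ -e "$h" ] || [ -L "$h" ] || continue
        basename "$h"
    done
}
```

If md is what keeps the node alive, running `list_holders nvme0n1` after the hot removal would be expected to still list md127.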

Thank you,
Revanth



