Leaking minor device numbers
Daniel Wagner
dwagner at suse.de
Fri Oct 7 05:37:10 PDT 2022
I've noticed the nvme subsystem is leaking minor device numbers:
# nvme connect-all
brw-rw---- 1 root disk 259, 3 Oct 6 18:58 /dev/nvme2n1
brw-rw---- 1 root disk 259, 5 Oct 6 18:58 /dev/nvme2n2
brw-rw---- 1 root disk 259, 7 Oct 6 18:58 /dev/nvme2n3
brw-rw---- 1 root disk 259, 9 Oct 6 18:58 /dev/nvme2n4
# while true; do nvme connect-all; nvme disconnect-all; done
brw-rw---- 1 root disk 259, 7913 Oct 6 19:13 /dev/nvme2n1
brw-rw---- 1 root disk 259, 7929 Oct 6 19:13 /dev/nvme2n2
brw-rw---- 1 root disk 259, 7943 Oct 6 19:13 /dev/nvme2n3
brw-rw---- 1 root disk 259, 7945 Oct 6 19:13 /dev/nvme2n4
After a bit of printk debugging I found the source of the
leak. nvme_alloc_ns() and nvme_alloc_ns_head() (nvme_mpath_set_live)
each allocate a minor device number per disk via blk_alloc_ext_minor(),
though through two different interfaces.
nvme_mpath_set_live() uses device_add_disk(), which also sets
the major number to 259/BLOCK_EXT_MAJOR. This allocation is freed.
The other allocation, however, happens via blk_mq_alloc_disk(), which
doesn't set the major number to 259/BLOCK_EXT_MAJOR, so when the disk
is eventually freed, we end up in bdev_free_inode() and do not call
blk_free_ext_minor():
static void bdev_free_inode(struct inode *inode)
{
	[...]
	if (MAJOR(bdev->bd_dev) == BLOCK_EXT_MAJOR)
		blk_free_ext_minor(MINOR(bdev->bd_dev));
	[...]
}
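To make the suspected mechanism concrete, here is a small userspace C
model of it. This is only a sketch of the hypothesis described above, not
kernel code: every toy_* name is hypothetical, the pool size is arbitrary,
and the assumption that one of the two devices never gets its major set to
259 is taken from the analysis in this mail rather than verified.

```c
#include <stdbool.h>

#define TOY_EXT_MAJOR  259   /* same value as BLOCK_EXT_MAJOR above */
#define TOY_MAX_MINORS 4096  /* arbitrary pool size for this model */

/* Hypothetical stand-in for the kernel's extended-minor allocator. */
static bool toy_minor_used[TOY_MAX_MINORS];

struct toy_dev {
	int major;
	int minor;
};

/* Hand out the lowest free minor, or -1 if the pool is exhausted. */
int toy_alloc_minor(void)
{
	for (int i = 0; i < TOY_MAX_MINORS; i++) {
		if (!toy_minor_used[i]) {
			toy_minor_used[i] = true;
			return i;
		}
	}
	return -1;
}

void toy_free_minor(int minor)
{
	if (minor >= 0 && minor < TOY_MAX_MINORS)
		toy_minor_used[minor] = false;
}

/* Mirrors the quoted check: release the minor only if the major matches. */
void toy_free_inode(const struct toy_dev *dev)
{
	if (dev->major == TOY_EXT_MAJOR)
		toy_free_minor(dev->minor);
}

/* Count how many minors are currently marked as allocated. */
int toy_minors_in_use(void)
{
	int count = 0;

	for (int i = 0; i < TOY_MAX_MINORS; i++)
		if (toy_minor_used[i])
			count++;
	return count;
}

/* One connect/disconnect cycle with two allocations, as assumed above. */
void toy_cycle(void)
{
	/* Assumed: this device ends up with the extended major set. */
	struct toy_dev head = { TOY_EXT_MAJOR, toy_alloc_minor() };
	/* Assumed: this device gets a minor but its major is never set. */
	struct toy_dev ns = { 0, toy_alloc_minor() };

	toy_free_inode(&head); /* major matches, minor is returned */
	toy_free_inode(&ns);   /* major mismatch, minor stays allocated */
}
```

Calling toy_cycle() in a loop and then checking toy_minors_in_use() shows
one minor left behind per iteration in this model, which is the same kind
of steady growth as in the listing above.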
I am a bit lost as to what is supposed to happen and how these pieces
are connected. Any ideas?