[BUG]LTS 5.10 regression NULL pointer deref in nvme_ioctl
Jinpu Wang
jinpu.wang at ionos.com
Wed Jun 8 01:02:30 PDT 2022
Hi folks on nvme list,
We hit the following crash when running "nvme list" on kernel 5.10.115:
/build/ionos-linux-pOGXys/ionos-linux-5.10.42/arch/x86/kernel/dumpstack.c: 359
#4 [ffffa80c7a5c7d10] no_context at ffffffffa7062467
/build/ionos-linux-pOGXys/ionos-linux-5.10.42/arch/x86/mm/fault.c: 754
#5 [ffffa80c7a5c7d78] exc_page_fault at ffffffffa77c919e
/build/ionos-linux-pOGXys/ionos-linux-5.10.42/arch/x86/mm/fault.c: 1327
#6 [ffffa80c7a5c7dd0] asm_exc_page_fault at ffffffffa7800a6e
/build/ionos-linux-pOGXys/ionos-linux-5.10.42/./arch/x86/include/asm/idtentry.h: 571
#7 [ffffa80c7a5c7e58] nvme_ioctl at ffffffffc10c1038 [nvme_core]
/build/ionos-linux-pOGXys/ionos-linux-5.10.42/drivers/nvme/host/nvme.h: 609
#8 [ffffa80c7a5c7ec0] blkdev_ioctl at ffffffffa73f6bc5
/build/ionos-linux-pOGXys/ionos-linux-5.10.42/block/ioctl.c: 237
#9 [ffffa80c7a5c7f08] block_ioctl at ffffffffa72c3329
/build/ionos-linux-pOGXys/ionos-linux-5.10.42/fs/block_dev.c: 1893
#10 [ffffa80c7a5c7f10] __x64_sys_ioctl at ffffffffa7292fe4
/build/ionos-linux-pOGXys/ionos-linux-5.10.42/fs/ioctl.c: 49
607 static inline void nvme_get_ctrl(struct nvme_ctrl *ctrl)
608 {
609 get_device(ctrl->device);
610 }
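A note on reading frame #7: as far as I know, the kernel's get_device() tolerates a NULL argument, so a fault attributed to nvme.h:609 most likely comes from loading ctrl->device, which would mean ctrl itself was NULL (or invalid) when nvme_ioctl reached this helper. The following is only a minimal userspace sketch of that reasoning, not the driver's code and not a proposed fix. The struct definitions are stand-ins, and nvme_get_ctrl_checked() is a hypothetical name used for illustration.

```c
#include <stddef.h>

/* Userspace stand-ins for the kernel types; only the fields used here. */
struct device { int refcount; };
struct nvme_ctrl { struct device *device; };

/* Models the NULL-tolerant behaviour assumed for get_device(). */
static struct device *get_device(struct device *dev)
{
	if (dev)
		dev->refcount++;
	return dev;
}

/*
 * Hypothetical guarded variant of nvme_get_ctrl(): the unguarded helper
 * evaluates ctrl->device first, which is the load that faults when ctrl
 * is NULL. Here a NULL ctrl is reported as an error instead.
 */
static int nvme_get_ctrl_checked(struct nvme_ctrl *ctrl)
{
	if (!ctrl)
		return -1;	/* caller would bail out instead of oopsing */
	get_device(ctrl->device);
	return 0;
}
```

If this reading is right, the interesting question is why the namespace looked up for nvme2 ends up without a valid controller pointer on 5.10 but not on 5.4.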
The same command works on kernel 5.4.
Kernel 5.10 panics while processing nvme2, which is set up a bit differently from the other disks.
Some more info regarding the disks:
$ sudo nvme list
Node          SN              Model                       Namespace  Usage              Format       FW Rev
------------  --------------  --------------------------  ---------  -----------------  -----------  --------
/dev/nvme0n1  S4BGNC0R600336  SAMSUNG MZQLB7T6HMLA-00007  1          1,39 GB / 6,40 TB  512 B + 0 B  EDB5202Q
/dev/nvme1n1  S4BGNC0R600346  SAMSUNG MZQLB7T6HMLA-00007  1          6,40 TB / 6,40 TB  512 B + 0 B  EDB5202Q
/dev/nvme2n1  S4BGNC0R600335  SAMSUNG MZQLB7T6HMLA-00007  1          1,37 GB / 6,40 TB  512 B + 0 B  EDB5202Q
/dev/nvme3n1  S546NE0R600295  SAMSUNG MZWLJ7T6HALA-00007  1          7,68 TB / 7,68 TB  512 B + 0 B  EPK9AB5Q
$ sudo nvme list-ctrl /dev/nvme2
[ 0]:0x1
[ 1]:0x2
[ 2]:0x3
[ 3]:0x4
[ 4]:0x5
[ 5]:0x6
[ 6]:0x7
[ 7]:0x8
[ 8]:0x9
[ 9]:0xa
[ 10]:0xb
[ 11]:0xc
[ 12]:0xd
[ 13]:0xe
[ 14]:0xf
[ 15]:0x10
[ 16]:0x11
[ 17]:0x12
[ 18]:0x13
[ 19]:0x14
[ 20]:0x15
[ 21]:0x16
[ 22]:0x17
[ 23]:0x18
[ 24]:0x19
[ 25]:0x1a
[ 26]:0x1b
[ 27]:0x1c
[ 28]:0x1d
[ 29]:0x1e
[ 30]:0x1f
[ 31]:0x20
[ 32]:0x21
[ 33]:0x22
[ 34]:0x23
[ 35]:0x24
[ 36]:0x25
[ 37]:0x26
[ 38]:0x27
[ 39]:0x28
[ 40]:0x29
[ 41]:0x2a
[ 42]:0x2b
[ 43]:0x2c
[ 44]:0x2d
[ 45]:0x2e
[ 46]:0x2f
[ 47]:0x30
[ 48]:0x31
[ 49]:0x32
[ 50]:0x33
[ 51]:0x34
[ 52]:0x35
[ 53]:0x36
[ 54]:0x37
[ 55]:0x38
[ 56]:0x39
[ 57]:0x3a
[ 58]:0x3b
[ 59]:0x3c
[ 60]:0x3d
[ 61]:0x3e
[ 62]:0x3f
[ 63]:0x40
[ 64]:0x41
[ 65]:0x42
$ sudo nvme list-ctrl /dev/nvme0
[ 0]:0x4
The Samsung PM1733 marks controller 0x0041 as the active master
controller. We attached only one namespace, to controller 0x41:
$ sudo nvme attach-ns /dev/nvme2 -n 1 -c 0x41
attach-ns: Success, nsid:1
$ sudo nvme list-ns /dev/nvme2
[ 0]:0x1
CONFIG_NVME_MULTIPATH=y is set for both kernels.
As it's a production server, it's hard to do a git bisect. I was hoping
someone has an idea what went wrong here; my guess is that the namespace
handling somehow changed.
Thanks!
Jinpu Wang @ IONOS