[BUG]LTS 5.10 regression NULL pointer deref in nvme_ioctl

Jinpu Wang jinpu.wang at ionos.com
Wed Jun 8 01:02:30 PDT 2022


Hi folks on nvme list,

We hit the following crash when running "nvme list" on kernel 5.10.115:

    /build/ionos-linux-pOGXys/ionos-linux-5.10.42/arch/x86/kernel/dumpstack.c: 359
 #4 [ffffa80c7a5c7d10] no_context at ffffffffa7062467
    /build/ionos-linux-pOGXys/ionos-linux-5.10.42/arch/x86/mm/fault.c: 754
 #5 [ffffa80c7a5c7d78] exc_page_fault at ffffffffa77c919e
    /build/ionos-linux-pOGXys/ionos-linux-5.10.42/arch/x86/mm/fault.c: 1327
 #6 [ffffa80c7a5c7dd0] asm_exc_page_fault at ffffffffa7800a6e
    /build/ionos-linux-pOGXys/ionos-linux-5.10.42/./arch/x86/include/asm/idtentry.h: 571
 #7 [ffffa80c7a5c7e58] nvme_ioctl at ffffffffc10c1038 [nvme_core]
    /build/ionos-linux-pOGXys/ionos-linux-5.10.42/drivers/nvme/host/nvme.h: 609
 #8 [ffffa80c7a5c7ec0] blkdev_ioctl at ffffffffa73f6bc5
    /build/ionos-linux-pOGXys/ionos-linux-5.10.42/block/ioctl.c: 237
 #9 [ffffa80c7a5c7f08] block_ioctl at ffffffffa72c3329
    /build/ionos-linux-pOGXys/ionos-linux-5.10.42/fs/block_dev.c: 1893
#10 [ffffa80c7a5c7f10] __x64_sys_ioctl at ffffffffa7292fe4
    /build/ionos-linux-pOGXys/ionos-linux-5.10.42/fs/ioctl.c: 49



607 static inline void nvme_get_ctrl(struct nvme_ctrl *ctrl)
608 {
609         get_device(ctrl->device);
610 }

The same command works on kernel 5.4.
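
To illustrate the failure mode at nvme.h:609, here is a minimal userspace
sketch, not kernel code. It assumes the ctrl pointer passed to
nvme_get_ctrl() is NULL, which the trace suggests but does not prove. The
struct layouts and the helper names null_ctrl_fault_addr() and
get_ctrl_checked() are hypothetical stand-ins; the real kernel structures
are much larger and get_device() does more than bump a counter.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; the real layouts differ. */
struct device {
    int refcount;
};

struct nvme_ctrl {
    int state;               /* placeholder for fields preceding 'device' */
    struct device *device;
};

/*
 * With ctrl == NULL, evaluating ctrl->device loads from address
 * NULL + offsetof(struct nvme_ctrl, device), so the faulting address
 * equals the member offset.
 */
static size_t null_ctrl_fault_addr(void)
{
    return offsetof(struct nvme_ctrl, device);
}

/*
 * Guarded variant of the helper, for illustration only: returns 0 after
 * taking a reference, or -1 if ctrl or ctrl->device is NULL.
 */
static int get_ctrl_checked(struct nvme_ctrl *ctrl)
{
    if (!ctrl || !ctrl->device)
        return -1;
    ctrl->device->refcount++;   /* stand-in for get_device() */
    return 0;
}
```

Calling get_ctrl_checked(NULL) returns -1 instead of faulting. In the real
kernel the fault address reported in the oops should correspond to the
offset of 'device' within struct nvme_ctrl, if the NULL-ctrl assumption
holds.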

Kernel 5.10 panics while processing nvme2, which is set up a bit
differently from the other devices.

Some more info regarding the disks:
$ sudo nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     S4BGNC0R600336       SAMSUNG MZQLB7T6HMLA-00007               1           1,39  GB /   6,40  TB    512   B +  0 B   EDB5202Q
/dev/nvme1n1     S4BGNC0R600346       SAMSUNG MZQLB7T6HMLA-00007               1           6,40  TB /   6,40  TB    512   B +  0 B   EDB5202Q
/dev/nvme2n1     S4BGNC0R600335       SAMSUNG MZQLB7T6HMLA-00007               1           1,37  GB /   6,40  TB    512   B +  0 B   EDB5202Q
/dev/nvme3n1     S546NE0R600295       SAMSUNG MZWLJ7T6HALA-00007               1           7,68  TB /   7,68  TB    512   B +  0 B   EPK9AB5Q

$ sudo nvme list-ctrl /dev/nvme2
[   0]:0x1
[   1]:0x2
[   2]:0x3
[   3]:0x4
[   4]:0x5
[   5]:0x6
[   6]:0x7
[   7]:0x8
[   8]:0x9
[   9]:0xa
[  10]:0xb
[  11]:0xc
[  12]:0xd
[  13]:0xe
[  14]:0xf
[  15]:0x10
[  16]:0x11
[  17]:0x12
[  18]:0x13
[  19]:0x14
[  20]:0x15
[  21]:0x16
[  22]:0x17
[  23]:0x18
[  24]:0x19
[  25]:0x1a
[  26]:0x1b
[  27]:0x1c
[  28]:0x1d
[  29]:0x1e
[  30]:0x1f
[  31]:0x20
[  32]:0x21
[  33]:0x22
[  34]:0x23
[  35]:0x24
[  36]:0x25
[  37]:0x26
[  38]:0x27
[  39]:0x28
[  40]:0x29
[  41]:0x2a
[  42]:0x2b
[  43]:0x2c
[  44]:0x2d
[  45]:0x2e
[  46]:0x2f
[  47]:0x30
[  48]:0x31
[  49]:0x32
[  50]:0x33
[  51]:0x34
[  52]:0x35
[  53]:0x36
[  54]:0x37
[  55]:0x38
[  56]:0x39
[  57]:0x3a
[  58]:0x3b
[  59]:0x3c
[  60]:0x3d
[  61]:0x3e
[  62]:0x3f
[  63]:0x40
[  64]:0x41
[  65]:0x42

$ sudo nvme list-ctrl /dev/nvme0
[   0]:0x4

The Samsung PM1733 marks controller 0x0041 as the active master
controller; we attached only one namespace to controller 0x41.

$ sudo nvme attach-ns /dev/nvme2 -n 1 -c 0x41
attach-ns: Success, nsid:1

$ sudo nvme list-ns /dev/nvme2
[   0]:0x1

CONFIG_NVME_MULTIPATH=y is set for both kernels.

As it's a production server, it's hard to do a git bisect. I was hoping
someone might have an idea what went wrong here; my guess is that the
namespace handling somehow changed.

Thanks!
Jinpu Wang @ IONOS
