crash on device removal

Steve Wise swise at opengridcomputing.com
Tue Jul 12 09:34:54 PDT 2016


Hey Christoph, 

I see a crash when shutting down an nvme host node via 'reboot' while it has one
target device attached.  The shutdown causes iw_cxgb4 to be removed, which
triggers the device removal logic in the nvmf rdma transport.  The crash is here:

(gdb) list *nvme_rdma_free_qe+0x18
0x1e8 is in nvme_rdma_free_qe (drivers/nvme/host/rdma.c:196).
191     }
192
193     static void nvme_rdma_free_qe(struct ib_device *ibdev, struct nvme_rdma_qe *qe,
194                     size_t capsule_size, enum dma_data_direction dir)
195     {
196             ib_dma_unmap_single(ibdev, qe->dma, capsule_size, dir);
197             kfree(qe->data);
198     }
199
200     static int nvme_rdma_alloc_qe(struct ib_device *ibdev, struct nvme_rdma_qe *qe,

Apparently qe is NULL.
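
As a quick sanity check on that reading: the Oops reports a fault address (CR2)
of 0x10, and the faulting instruction bytes <48> 8b 76 10 decode to
mov 0x10(%rsi),%rsi with RSI (the second argument, qe) equal to 0.  That fits a
load of qe->dma through a NULL qe, provided dma sits at offset 0x10.  The sketch
below checks that offset in userspace.  The mock_* types are hypothetical
stand-ins, and their layout (a one-pointer struct ib_cqe, followed by data and
dma) is an assumption about the 4.7-era definitions, not something taken from
this report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct ib_cqe: assumed to hold one function pointer. */
struct mock_ib_cqe {
	void (*done)(void *cq, void *wc);
};

/* Hypothetical stand-in for struct nvme_rdma_qe; field order is an assumption. */
struct mock_nvme_rdma_qe {
	struct mock_ib_cqe cqe;
	void *data;
	uint64_t dma;
};

/* Compile-time check; only expected to hold on an LP64 target such as x86_64. */
static_assert(offsetof(struct mock_nvme_rdma_qe, dma) == 0x10,
	      "dma is expected at offset 0x10 on LP64");

/* Address the CPU would touch when reading qe->dma with qe == NULL. */
size_t null_qe_dma_fault_addr(void)
{
	return offsetof(struct mock_nvme_rdma_qe, dma);
}
```

If the real structure matches this layout, null_qe_dma_fault_addr() returns the
same 0x10 that shows up in CR2, which supports qe itself being NULL rather than
qe->data or ibdev.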

Looking at the device removal path, the logic appears correct (see
nvme_rdma_device_unplug() and the nice function comment :) ).  I'm wondering
whether, while the host device removal path is cleaning up queues, the target is
concurrently disconnecting all of its queues in response to the first disconnect
event from the host, causing a cleanup race on the host side.  Although, since
the removal path is executing in the cma event handler upcall, I don't think
another thread would be handling a disconnect event.  Maybe the qp async event
handler flow?

Thoughts?

Here is the Oops:

[  710.929451] iw_cxgb4:0000:83:00.4: Detach
[  711.242989] iw_cxgb4:0000:82:00.4: Detach
[  711.247039] nvme nvme1: Got rdma device removal event, deleting ctrl
[  711.298244] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[  711.306162] IP: [<ffffffffa039a1e8>] nvme_rdma_free_qe+0x18/0x80 [nvme_rdma]
[  711.313286] PGD 0
[  711.315348] Oops: 0000 [#1] SMP
[  711.318519] Modules linked in: nvme_rdma nvme_fabrics brd iw_cxgb4 cxgb4
ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE
nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4
nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT nf_reject_ipv4 xt_CHECKSUM
iptable_mangle iptable_filter ip_tables bridge 8021q mrp garp stp llc cachefiles
fscache rdma_ucm rdma_cm iw_cm ib_ipoib ib_cm ib_uverbs ib_umad ocrdma be2net
iw_nes libcrc32c iw_cxgb3 cxgb3 mdio ib_qib rdmavt mlx5_ib mlx5_core mlx4_en
ib_mthca binfmt_misc dm_mirror dm_region_hash dm_log vhost_net macvtap macvlan
vhost tun kvm irqbypass uinput iTCO_wdt iTCO_vendor_support mxm_wmi pcspkr
mlx4_ib ib_core mlx4_core dm_mod i2c_i801 sg ipmi_ssif ipmi_si ipmi_msghandler
nvme nvme_core lpc_ich mfd_core mei_me mei igb dca ptp pps_core wmi ext4(E)
mbcache(E) jbd2(E) sd_mod(E) ahci(E) libahci(E) libata(E) mgag200(E) ttm(E)
drm_kms_helper(E) drm(E) fb_sys_fops(E) sysimgblt(E) sysfillrect(E)
syscopyarea(E) i2c_algo_bit(E) i2c_core(E) [last unloaded: cxgb4]
[  711.412158] CPU: 0 PID: 4213 Comm: reboot Tainted: G            E 4.7.0-rc2-block-for-next+ #77
[  711.421064] Hardware name: Supermicro X9DR3-F/X9DR3-F, BIOS 3.2a 07/09/2015
[  711.428058] task: ffff881033b495c0 ti: ffff88100fc24000 task.ti: ffff88100fc24000
[  711.435563] RIP: 0010:[<ffffffffa039a1e8>]  [<ffffffffa039a1e8>] nvme_rdma_free_qe+0x18/0x80 [nvme_rdma]
[  711.445104] RSP: 0018:ffff88100fc279a8  EFLAGS: 00010292
[  711.450442] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
[  711.457608] RDX: 0000000000000010 RSI: 0000000000000000 RDI: ffff881034168000
[  711.464775] RBP: ffff88100fc279b8 R08: 0000000000000001 R09: ffffea0001e51d10
[  711.471943] R10: ffffea0001e51d18 R11: 0000000000000000 R12: 0000000000000000
[  711.479112] R13: 0000000000000020 R14: ffff881034168000 R15: ffff8810345b8140
[  711.486285] FS:  00007feac7042700(0000) GS:ffff88103ee00000(0000) knlGS:0000000000000000
[  711.494405] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  711.500175] CR2: 0000000000000010 CR3: 00000010229d7000 CR4: 00000000000406f0
[  711.507341] Stack:
[  711.509367]  ffff881034285000 0000000000000001 ffff88100fc279f8 ffffffffa039adcf
[  711.516868]  ffff88100fc279d8 ffff881034285000 ffff881037f9f000 ffff881034272c00
[  711.524384]  ffff88100fc27b18 ffff881034272dd8 ffff88100fc27a88 ffffffffa039c8f5
[  711.531897] Call Trace:
[  711.534371]  [<ffffffffa039adcf>] nvme_rdma_destroy_queue_ib+0x5f/0x90 [nvme_rdma]
[  711.541972]  [<ffffffffa039c8f5>] nvme_rdma_cm_handler+0x2c5/0x340 [nvme_rdma]
[  711.549228]  [<ffffffff811ff71d>] ? kmem_cache_free+0x1dd/0x200
[  711.555177]  [<ffffffffa070e669>] ? cma_comp+0x49/0x60 [rdma_cm]
[  711.561217]  [<ffffffffa071310f>] cma_remove_id_dev+0x8f/0xa0 [rdma_cm]
[  711.567860]  [<ffffffffa07131d7>] cma_process_remove+0xb7/0x100 [rdma_cm]
[  711.574678]  [<ffffffff812a4de4>] ? __kernfs_remove+0x114/0x1d0
[  711.580626]  [<ffffffffa071325e>] cma_remove_one+0x3e/0x60 [rdma_cm]
[  711.587015]  [<ffffffffa03b8ca0>] ib_unregister_device+0xb0/0x150 [ib_core]
[  711.595252]  [<ffffffffa0816034>] c4iw_unregister_device+0x64/0x90 [iw_cxgb4]
[  711.603648]  [<ffffffffa0809357>] c4iw_remove+0x27/0x60 [iw_cxgb4]
[  711.611069]  [<ffffffffa080a061>] c4iw_uld_state_change+0x111/0x250 [iw_cxgb4]
[  711.619532]  [<ffffffff816da18d>] ? _cond_resched+0x1d/0x30
[  711.626317]  [<ffffffff81371971>] ? list_del+0x11/0x40
[  711.632678]  [<ffffffffa07ce71a>] detach_ulds+0x4a/0xf0 [cxgb4]
[  711.639822]  [<ffffffffa07ce94d>] remove_one+0x18d/0x1b0 [cxgb4]
[  711.647060]  [<ffffffff81397c21>] pci_device_shutdown+0x41/0x90
[  711.654189]  [<ffffffff814861f5>] device_shutdown+0x45/0x1b0
[  711.661051]  [<ffffffff810ac746>] kernel_restart_prepare+0x36/0x40
[  711.668414]  [<ffffffff810ac8c6>] kernel_restart+0x16/0x60
[  711.675084]  [<ffffffff810acb15>] SYSC_reboot+0x1a5/0x230
[  711.681645]  [<ffffffff81245ad1>] ? mntput+0x21/0x30
[  711.687738]  [<ffffffff812267a7>] ? __fput+0x177/0x240
[  711.693964]  [<ffffffff8122691e>] ? ____fput+0xe/0x10
[  711.700097]  [<ffffffff81003476>] ? do_audit_syscall_entry+0x66/0x70
[  711.707481]  [<ffffffff81003578>] ? syscall_trace_enter_phase1+0xf8/0x120
[  711.715273]  [<ffffffff81003344>] ? exit_to_usermode_loop+0x74/0xf0
[  711.722514]  [<ffffffff810acbae>] SyS_reboot+0xe/0x10
[  711.728517]  [<ffffffff81003f08>] do_syscall_64+0x78/0x1d0
[  711.734931]  [<ffffffff8106e327>] ? do_page_fault+0x37/0x90
[  711.741410]  [<ffffffff816ddee1>] entry_SYSCALL64_slow_path+0x25/0x25
[  711.748731] Code: 01 00 00 c9 c3 0f 0b eb fe 66 2e 0f 1f 84 00 00 00 00 00 55
48 89 e5 53 48 83 ec 08 66 66 66 66 90 48 8b 87 f0 02 00 00 48 89 f3 <48> 8b 76
10 48 85 c0 74 13 ff 50 10 48 8b 7b 08 e8 93 4d e6 e0
[  711.770832] RIP  [<ffffffffa039a1e8>] nvme_rdma_free_qe+0x18/0x80 [nvme_rdma]
[  711.778904]  RSP <ffff88100fc279a8>
[  711.783290] CR2: 0000000000000010



