Kernel NULL pointer dereference observed on initiator side after 'nvmetcli clear' on target side
Yi Zhang
yizhan at redhat.com
Sun Mar 5 05:39:44 PST 2017
Hi experts,
If I offline one CPU on the initiator side and then run 'nvmetcli clear' on the target side, the initiator hits a kernel NULL pointer dereference. Could you help check it? Thanks.
Steps to reproduce:
1. Set up the nvmet target with a null_blk device:
#modprobe nvmet
#modprobe nvmet-rdma
#modprobe null_blk nr_devices=1
#nvmetcli restore rdma.json
2. Connect to the target from the initiator side and offline one CPU:
#modprobe nvme-rdma
#nvme connect-all -t rdma -a 172.31.2.3 -s 1023
#echo 0 > /sys/devices/system/cpu/cpu1/online
3. Run nvmetcli clear on the target side:
#nvmetcli clear
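The three steps above can be collected into a small script so the sequence is easy to repeat. This is only a sketch: the function names and the DRY_RUN, TARGET_ADDR, TARGET_PORT and CPU variables are made up for illustration, the commands themselves are the ones listed above, and each function has to be run as root on the host named in its comment.

```shell
#!/bin/sh
# Sketch of the reproduction steps. All names here (run, target_setup,
# initiator_connect, target_clear, DRY_RUN, TARGET_ADDR, TARGET_PORT, CPU)
# are illustrative assumptions, not part of any tool.

# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 1, on the target host: export a null_blk device over RDMA.
target_setup() {
    run modprobe nvmet
    run modprobe nvmet-rdma
    run modprobe null_blk nr_devices=1
    run nvmetcli restore rdma.json
}

# Step 2, on the initiator host: connect, then take one CPU offline.
initiator_connect() {
    run modprobe nvme-rdma
    run nvme connect-all -t rdma -a "${TARGET_ADDR:-172.31.2.3}" -s "${TARGET_PORT:-1023}"
    run sh -c "echo 0 > /sys/devices/system/cpu/cpu${CPU:-1}/online"
}

# Step 3, on the target host: tear down the target configuration.
target_clear() {
    run nvmetcli clear
}
```

With the default DRY_RUN=1, calling target_setup, initiator_connect and target_clear in that order prints the command sequence without touching the system, which is handy for checking the address and CPU number before running it for real.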
Kernel log:
[ 125.039340] nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 172.31.2.3:1023
[ 125.160587] nvme nvme0: creating 16 I/O queues.
[ 125.602244] nvme nvme0: new ctrl: NQN "testnqn", addr 172.31.2.3:1023
[ 140.930343] Broke affinity for irq 16
[ 140.950295] Broke affinity for irq 28
[ 140.969957] Broke affinity for irq 70
[ 140.986584] Broke affinity for irq 90
[ 141.003160] Broke affinity for irq 93
[ 141.019779] Broke affinity for irq 97
[ 141.036341] Broke affinity for irq 100
[ 141.053782] Broke affinity for irq 104
[ 141.072860] smpboot: CPU 1 is now offline
[ 154.768104] nvme nvme0: reconnecting in 10 seconds
[ 165.349689] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 165.387783] IP: blk_mq_reinit_tagset+0x35/0x80
[ 165.409550] PGD 0
[ 165.409550]
[ 165.427269] Oops: 0000 [#1] SMP
[ 165.442876] Modules linked in: nvme_rdma nvme_fabrics nvme_core xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter bridge stp llc rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx4_ib ib_core intel_rapl ipmi_ssif sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore intel_rapl_perf iTCO_wdt ipmi_si iTCO_vendor_support wmi hpwdt pcspkr sg ipmi_devintf hpilo
[ 165.769732] acpi_power_meter ipmi_msghandler ioatdma shpchp acpi_cpufreq lpc_ich dca nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs mlx4_en sr_mod sd_mod cdrom mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt ata_generic fb_sys_fops pata_acpi ttm bnx2x drm e1000e ata_piix mdio ptp mlx4_core i2c_core serio_raw libata pps_core hpsa libcrc32c devlink fjes scsi_transport_sas crc32c_intel dm_mirror dm_region_hash dm_log dm_mod
[ 165.957288] CPU: 6 PID: 424 Comm: kworker/6:2 Not tainted 4.10.0+ #3
[ 165.985856] Hardware name: HP ProLiant DL388p Gen8, BIOS P70 12/20/2013
[ 166.015576] Workqueue: nvme_rdma_wq nvme_rdma_reconnect_ctrl_work [nvme_rdma]
[ 166.047813] task: ffff8804291f9680 task.stack: ffffc90004fa4000
[ 166.074543] RIP: 0010:blk_mq_reinit_tagset+0x35/0x80
[ 166.096784] RSP: 0018:ffffc90004fa7e00 EFLAGS: 00010246
[ 166.120205] RAX: ffff88082a97f600 RBX: 0000000000000000 RCX: 000000018020001a
[ 166.152099] RDX: 0000000000000001 RSI: ffff88042c1b5240 RDI: ffff88042c163680
[ 166.183997] RBP: ffffc90004fa7e20 R08: ffff88042c388400 R09: 000000018020001a
[ 166.216018] R10: 000000002c388801 R11: ffff88042c388400 R12: 0000000000000000
[ 166.248248] R13: 0000000000000001 R14: ffff8804be65d018 R15: 0000000000000180
[ 166.280594] FS: 0000000000000000(0000) GS:ffff88042f780000(0000) knlGS:0000000000000000
[ 166.317022] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 166.342821] CR2: 0000000000000000 CR3: 0000000001c09000 CR4: 00000000000406e0
[ 166.374899] Call Trace:
[ 166.385854] nvme_rdma_reconnect_ctrl_work+0x60/0x1f0 [nvme_rdma]
[ 166.414954] process_one_work+0x165/0x410
[ 166.434888] worker_thread+0x137/0x4c0
[ 166.453275] kthread+0x101/0x140
[ 166.469530] ? rescuer_thread+0x3b0/0x3b0
[ 166.487549] ? kthread_park+0x90/0x90
[ 166.503966] ret_from_fork+0x2c/0x40
[ 166.520071] Code: 56 49 89 fe 41 55 41 54 53 48 8b 47 08 48 83 78 40 00 74 55 8b 57 10 85 d2 74 4e 45 31 ed 49 8b 46 38 49 63 d5 31 db 4c 8b 24 d0 <41> 8b 04 24 85 c0 74 2c 49 8b 84 24 80 00 00 00 48 63 d3 48 8b
[ 166.605127] RIP: blk_mq_reinit_tagset+0x35/0x80 RSP: ffffc90004fa7e00
[ 166.634093] CR2: 0000000000000000
[ 166.648963] ---[ end trace cabb6f7f7f9f7187 ]---
[ 166.674180] Kernel panic - not syncing: Fatal exception
[ 166.697717] Kernel Offset: disabled
[ 166.717719] ---[ end Kernel panic - not syncing: Fatal exception
[ 166.746440] ------------[ cut here ]------------
[ 166.767150] WARNING: CPU: 6 PID: 424 at arch/x86/kernel/smp.c:127 native_smp_send_reschedule+0x3f/0x50
[ 166.808742] Modules linked in: nvme_rdma nvme_fabrics nvme_core xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter bridge stp llc rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx4_ib ib_core intel_rapl ipmi_ssif sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore intel_rapl_perf iTCO_wdt ipmi_si iTCO_vendor_support wmi hpwdt pcspkr sg ipmi_devintf hpilo
[ 167.131981] acpi_power_meter ipmi_msghandler ioatdma shpchp acpi_cpufreq lpc_ich dca nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs mlx4_en sr_mod sd_mod cdrom mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt ata_generic fb_sys_fops pata_acpi ttm bnx2x drm e1000e ata_piix mdio ptp mlx4_core i2c_core serio_raw libata pps_core hpsa libcrc32c devlink fjes scsi_transport_sas crc32c_intel dm_mirror dm_region_hash dm_log dm_mod
[ 167.315426] CPU: 6 PID: 424 Comm: kworker/6:2 Tainted: G D 4.10.0+ #3
[ 167.349430] Hardware name: HP ProLiant DL388p Gen8, BIOS P70 12/20/2013
[ 167.379147] Workqueue: nvme_rdma_wq nvme_rdma_reconnect_ctrl_work [nvme_rdma]
[ 167.411437] Call Trace:
[ 167.422486] <IRQ>
[ 167.432587] dump_stack+0x63/0x87
[ 167.449042] __warn+0xd1/0xf0
[ 167.463891] warn_slowpath_null+0x1d/0x20
[ 167.483697] native_smp_send_reschedule+0x3f/0x50
[ 167.506498] resched_curr+0xa1/0xc0
[ 167.522992] check_preempt_curr+0x70/0x90
[ 167.541625] ttwu_do_wakeup+0x19/0xe0
[ 167.559098] ttwu_do_activate+0x6f/0x80
[ 167.577357] try_to_wake_up+0x1aa/0x3b0
[ 167.594742] ? select_idle_sibling+0x2c/0x3d0
[ 167.614498] default_wake_function+0x12/0x20
[ 167.633655] __wake_up_common+0x55/0x90
[ 167.650534] __wake_up_locked+0x13/0x20
[ 167.667784] ep_poll_callback+0xbb/0x240
[ 167.685405] __wake_up_common+0x55/0x90
[ 167.702615] __wake_up+0x39/0x50
[ 167.717046] wake_up_klogd_work_func+0x40/0x60
[ 167.736993] irq_work_run_list+0x4d/0x70
[ 167.755647] ? tick_sched_do_timer+0x70/0x70
[ 167.776239] irq_work_tick+0x40/0x50
[ 167.792914] update_process_times+0x42/0x60
[ 167.812138] tick_sched_handle.isra.18+0x25/0x60
[ 167.833794] tick_sched_timer+0x3d/0x70
[ 167.851391] __hrtimer_run_queues+0xf3/0x280
[ 167.871180] hrtimer_interrupt+0xa8/0x1a0
[ 167.889854] local_apic_timer_interrupt+0x35/0x60
[ 167.912036] smp_apic_timer_interrupt+0x38/0x50
[ 167.933375] apic_timer_interrupt+0x93/0xa0
[ 167.954586] RIP: 0010:panic+0x1f5/0x239
[ 167.974032] RSP: 0018:ffffc90004fa7b50 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
[ 168.009365] RAX: 0000000000000034 RBX: 0000000000000000 RCX: 0000000000000006
[ 168.041566] RDX: 0000000000000000 RSI: 0000000000000046 RDI: ffff88042f78e000
[ 168.073801] RBP: ffffc90004fa7bc0 R08: 00000000fffffffe R09: 00000000000004d9
[ 168.105833] R10: 0000000000000005 R11: 00000000000004d8 R12: ffffffff81a0e2e1
[ 168.137892] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000046
[ 168.170234] </IRQ>
[ 168.179603] oops_end+0xb8/0xd0
[ 168.193685] no_context+0x19e/0x3f0
[ 168.209369] ? lock_timer_base+0xa0/0xa0
[ 168.227067] __bad_area_nosemaphore+0xee/0x1d0
[ 168.246978] bad_area_nosemaphore+0x14/0x20
[ 168.266108] __do_page_fault+0x89/0x4a0
[ 168.283345] ? __slab_free+0x9b/0x2c0
[ 168.299742] do_page_fault+0x30/0x80
[ 168.315903] page_fault+0x28/0x30
[ 168.330741] RIP: 0010:blk_mq_reinit_tagset+0x35/0x80
[ 168.353028] RSP: 0018:ffffc90004fa7e00 EFLAGS: 00010246
[ 168.376493] RAX: ffff88082a97f600 RBX: 0000000000000000 RCX: 000000018020001a
[ 168.408373] RDX: 0000000000000001 RSI: ffff88042c1b5240 RDI: ffff88042c163680
[ 168.440447] RBP: ffffc90004fa7e20 R08: ffff88042c388400 R09: 000000018020001a
[ 168.476491] R10: 000000002c388801 R11: ffff88042c388400 R12: 0000000000000000
[ 168.510913] R13: 0000000000000001 R14: ffff8804be65d018 R15: 0000000000000180
[ 168.543964] nvme_rdma_reconnect_ctrl_work+0x60/0x1f0 [nvme_rdma]
[ 168.571458] process_one_work+0x165/0x410
[ 168.589496] worker_thread+0x137/0x4c0
[ 168.606267] kthread+0x101/0x140
[ 168.620712] ? rescuer_thread+0x3b0/0x3b0
[ 168.638747] ? kthread_park+0x90/0x90
[ 168.655224] ret_from_fork+0x2c/0x40
[ 168.671278] ---[ end trace cabb6f7f7f9f7188 ]---
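For anyone comparing against their own logs, the two facts that matter most in the trace above are the faulting symbol on the "IP:" line and the work function on the "Workqueue:" line. A minimal sketch for pulling them out of a saved log follows; the function name oops_summary and its output format are assumptions for illustration, and it expects the line layout shown in this log.

```shell
#!/bin/sh
# Sketch: summarize an oops from a kernel log read on stdin.
# oops_summary and its output format are illustrative assumptions.
oops_summary() {
    awk '
        # First "IP:" line only; the leading space keeps "RIP:" lines out.
        / IP: / && !ip { ip = $NF }
        # The work function is the second word after "Workqueue:".
        / Workqueue: / && !wq {
            for (i = 1; i <= NF; i++)
                if ($i == "Workqueue:") wq = $(i + 2)
        }
        END { printf "ip=%s workqueue_fn=%s\n", ip, wq }
    '
}
```

Usage would be along the lines of `dmesg | oops_summary`, or `oops_summary < saved.log` for a captured console log. In this log R12 and CR2 are both zero at the fault, so a summary like this is just a quick way to confirm that another crash is the same one before comparing registers.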
Best Regards,
Yi Zhang