kernel NULL pointer observed on initiator side after 'nvmetcli clear' on target side

Sagi Grimberg sagi at grimberg.me
Mon Mar 6 03:25:39 PST 2017


> Hi experts
>
> If I offline one CPU on the initiator side and run 'nvmetcli clear' on the target side, the initiator hits a kernel NULL pointer dereference. Could you help check it? Thanks.
>
> Steps to reproduce:
> 1. set up the nvmet target with a null-blk device:
> #modprobe nvmet
> #modprobe nvmet-rdma
> #modprobe null_blk nr_devices=1
> #nvmetcli restore rdma.json
>
> 2. connect the target on initiator side and offline one cpu:
> #modprobe nvme-rdma
> #nvme connect-all -t rdma -a 172.31.2.3 -s 1023
> #echo 0 > /sys/devices/system/cpu/cpu1/online
>
> 3. run nvmetcli clear on the target side:
> #nvmetcli clear
>
> Kernel log:
>
> [  125.039340] nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 172.31.2.3:1023
> [  125.160587] nvme nvme0: creating 16 I/O queues.
> [  125.602244] nvme nvme0: new ctrl: NQN "testnqn", addr 172.31.2.3:1023
> [  140.930343] Broke affinity for irq 16
> [  140.950295] Broke affinity for irq 28
> [  140.969957] Broke affinity for irq 70
> [  140.986584] Broke affinity for irq 90
> [  141.003160] Broke affinity for irq 93
> [  141.019779] Broke affinity for irq 97
> [  141.036341] Broke affinity for irq 100
> [  141.053782] Broke affinity for irq 104
> [  141.072860] smpboot: CPU 1 is now offline
> [  154.768104] nvme nvme0: reconnecting in 10 seconds
> [  165.349689] BUG: unable to handle kernel NULL pointer dereference at           (null)
> [  165.387783] IP: blk_mq_reinit_tagset+0x35/0x80

Looks like blk_mq_reinit_tagset is not aware that tags can go away with
cpu hotplug, so set->tags[i] can be NULL when it walks the tag set...

Does this fix your issue?
--
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index e48bc2c72615..9d97bfc4d465 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -295,6 +295,9 @@ int blk_mq_reinit_tagset(struct blk_mq_tag_set *set)
         for (i = 0; i < set->nr_hw_queues; i++) {
                 struct blk_mq_tags *tags = set->tags[i];

+               if (!tags)
+                       continue;
+
                 for (j = 0; j < tags->nr_tags; j++) {
                         if (!tags->static_rqs[j])
                                 continue;
--
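
For anyone who wants to see the pattern outside the kernel, here is a
minimal user-space sketch of the same guard. Everything named mock_* is
hypothetical and only mirrors the shape of the fields touched by the diff
above; the per-request reinit work is replaced by a simple counter. The
assumption being illustrated is that one slot in the tags array has gone
NULL (as it seems to after the CPU offline), so the walk has to skip it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, not the real types. */
struct mock_request {
	int reinit_count;
};

struct mock_tags {
	unsigned int nr_tags;
	struct mock_request **static_rqs;
};

struct mock_tag_set {
	unsigned int nr_hw_queues;
	struct mock_tags **tags;
};

/*
 * Walk every hardware queue and "reinit" each request.
 * Returns the number of requests touched. Skips queues whose tags
 * pointer is NULL, and skips unpopulated request slots.
 */
static int mock_reinit_tagset(struct mock_tag_set *set)
{
	unsigned int i, j;
	int count = 0;

	for (i = 0; i < set->nr_hw_queues; i++) {
		struct mock_tags *tags = set->tags[i];

		/* The guard from the patch: this queue's tags are gone. */
		if (!tags)
			continue;

		for (j = 0; j < tags->nr_tags; j++) {
			if (!tags->static_rqs[j])
				continue;
			tags->static_rqs[j]->reinit_count++;
			count++;
		}
	}
	return count;
}

/* Build a 3-queue set where queue 1 has no tags, then walk it. */
static int mock_demo(void)
{
	struct mock_request rq[3] = { {0}, {0}, {0} };
	struct mock_request *q0[2] = { &rq[0], &rq[1] };
	struct mock_request *q2[2] = { &rq[2], NULL };
	struct mock_tags t0 = { 2, q0 };
	struct mock_tags t2 = { 2, q2 };
	struct mock_tags *all[3] = { &t0, NULL, &t2 };
	struct mock_tag_set set = { 3, all };
	int count = mock_reinit_tagset(&set);

	/* Each populated request should have been visited exactly once. */
	assert(rq[0].reinit_count == 1);
	assert(rq[1].reinit_count == 1);
	assert(rq[2].reinit_count == 1);
	return count;
}
```

Without the `if (!tags)` check, the walk in this sketch would dereference
the NULL entry for queue 1 at `tags->nr_tags`, which is the same kind of
access the oops points at.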


