nvme_tcp BUG: unable to handle kernel NULL pointer dereference at 0000000000000230
Sagi Grimberg
sagi at grimberg.me
Wed Jun 9 01:04:58 PDT 2021
> Hi Sagi,
>
> Indeed, RHEL8.3 does not have the mutex protection on nvme_tcp_stop_queue.
> However, in our case, based on the back trace below,
> we don't get to __nvme_tcp_stop_queue from nvme_tcp_stop_queue.
> We get to it from:
> nvme_tcp_reconnect_ctrl_work --> nvme_tcp_setup_ctrl --> nvme_tcp_start_queue --> __nvme_tcp_stop_queue
>
> So I'm not sure how this mutex protection will help in this case.
Oh, well, IIRC we probably need the same mutex protection in the
nvme_tcp_start_queue failure case then?
--
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 216d21a6a165..00dff3654e6f 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1548,6 +1548,7 @@ static void nvme_tcp_stop_queue(struct nvme_ctrl *nctrl, int qid)
 static int nvme_tcp_start_queue(struct nvme_ctrl *nctrl, int idx)
 {
 	struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl);
+	struct nvme_tcp_queue *queue = &ctrl->queues[idx];
 	int ret;
 
 	if (idx)
@@ -1556,10 +1557,12 @@ static int nvme_tcp_start_queue(struct nvme_ctrl *nctrl, int idx)
 		ret = nvmf_connect_admin_queue(nctrl);
 
 	if (!ret) {
-		set_bit(NVME_TCP_Q_LIVE, &ctrl->queues[idx].flags);
+		set_bit(NVME_TCP_Q_LIVE, &queue->flags);
 	} else {
-		if (test_bit(NVME_TCP_Q_ALLOCATED, &ctrl->queues[idx].flags))
-			__nvme_tcp_stop_queue(&ctrl->queues[idx]);
+		mutex_lock(&queue->queue_lock);
+		if (test_bit(NVME_TCP_Q_ALLOCATED, &queue->flags))
+			__nvme_tcp_stop_queue(queue);
+		mutex_unlock(&queue->queue_lock);
 		dev_err(nctrl->device,
 			"failed to connect queue: %d ret=%d\n", idx, ret);
 	}
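
To illustrate why the flag test and the stop need to sit under the same
lock, here is a minimal user-space sketch using pthreads. It is not kernel
code: struct demo_queue, demo_stop, demo_free and demo_race_once are
hypothetical names, and it assumes the failure mode is a stop path racing
with a path that frees the queue, which has not been confirmed in this
thread.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in; not the kernel's struct nvme_tcp_queue. */
struct demo_queue {
	pthread_mutex_t lock;	/* plays the role of queue_lock */
	bool allocated;		/* plays the role of NVME_TCP_Q_ALLOCATED */
	int *sock;		/* resource that the free path releases */
	int stops;		/* number of times the stop body ran */
};

/* Stop path: test the flag and touch the resource under the same lock. */
static void *demo_stop(void *arg)
{
	struct demo_queue *q = arg;

	pthread_mutex_lock(&q->lock);
	if (q->allocated) {
		*q->sock = 0;	/* safe: the free path cannot run concurrently */
		q->stops++;
	}
	pthread_mutex_unlock(&q->lock);
	return NULL;
}

/* Free path: clear the flag and release the resource under the lock. */
static void *demo_free(void *arg)
{
	struct demo_queue *q = arg;

	pthread_mutex_lock(&q->lock);
	if (q->allocated) {
		q->allocated = false;
		free(q->sock);
		q->sock = NULL;
	}
	pthread_mutex_unlock(&q->lock);
	return NULL;
}

/*
 * Run one stop and one free concurrently.
 * Returns how often the stop body ran (0 or 1), or -1 on setup failure.
 */
static int demo_race_once(void)
{
	struct demo_queue q = { .allocated = true, .stops = 0 };
	pthread_t stopper, freer;

	q.sock = malloc(sizeof(*q.sock));
	if (!q.sock)
		return -1;
	*q.sock = 1;	/* pretend the socket is open */
	pthread_mutex_init(&q.lock, NULL);

	if (pthread_create(&stopper, NULL, demo_stop, &q)) {
		free(q.sock);
		pthread_mutex_destroy(&q.lock);
		return -1;
	}
	if (pthread_create(&freer, NULL, demo_free, &q)) {
		pthread_join(stopper, NULL);
		free(q.sock);
		pthread_mutex_destroy(&q.lock);
		return -1;
	}
	pthread_join(stopper, NULL);
	pthread_join(freer, NULL);	/* the free path has released q.sock */

	pthread_mutex_destroy(&q.lock);
	return q.stops;
}
```

Calling demo_race_once() in a loop should return 0 or 1 every time,
depending on which thread wins the lock. If the lock is dropped from
demo_stop, the flag test and the write through q->sock can interleave with
the free, which is the kind of window the mutex in the patch above is
meant to close.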