[PATCH] nvme: tcp: avoid race between lock and destroy of queue_lock

Thu Sep 12 06:28:35 PDT 2024

On Sep 12, 2024 / 13:20, Sagi Grimberg wrote:
> 
> 
> 
> On 12/09/2024 12:16, Hannes Reinecke wrote:
> > On 9/12/24 08:59, Sagi Grimberg wrote:
> > > 
> > > 
> > > 
> > > On 12/09/2024 9:27, Shin'ichiro Kawasaki wrote:
[...]
> > > 
> > > This is really not needed.. You can already serialize with:
> > > -- 
> > > diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> > > index a2a47d3ab99f..91c19d06cae9 100644
> > > --- a/drivers/nvme/host/tcp.c
> > > +++ b/drivers/nvme/host/tcp.c
> > > @@ -1365,6 +1365,10 @@ static void nvme_tcp_free_queue(struct
> > > nvme_ctrl *nctrl, int qid)
> > >          if (!test_and_clear_bit(NVME_TCP_Q_ALLOCATED, &queue->flags))
> > >                  return;
> > > 
> > > +       /* make sure any concurrent .get_address completes */
> > > +       mutex_lock(&queue->queue_lock);
> > > +       mutex_unlock(&queue->queue_lock);
> > > +
> > >          if (queue->hdr_digest || queue->data_digest)
> > >                  nvme_tcp_free_crypto(queue);

I applied this change above only on top of the kernel v6.11-rc7. Unfortunately,
the WARN was still observed.

> > 
> > Wouldn't it be sufficient to check for NVME_TCP_Q_LIVE outside of the
> > mutex?
> 
> I think its intentional, so this serializes against nvme_tcp_stop_queue()

IIUC, NVME_TCP_Q_LIVE in the mutex region is required in nvme_tcp_free_queue(),
and we can not simply move it out of the region.

As a similar idea, I can think of doing the NVME_TCP_Q_LIVE check both in and
out of the mutex region. The change below does it by adding the check out of the
region:

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index a2a47d3ab99f..ba91f98e90f7 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2617,6 +2617,9 @@ static int nvme_tcp_get_address(struct nvme_ctrl *ctrl, char *buf, int size)
 
 	len = nvmf_get_address(ctrl, buf, size);
 
+	if (!test_bit(NVME_TCP_Q_LIVE, &queue->flags))
+		return len;
+
 	mutex_lock(&queue->queue_lock);
 
 	if (!test_bit(NVME_TCP_Q_LIVE, &queue->flags))

With this change, I repeated the test case 400 times and the WARN was not
observed. Good, this fix might be enough.

Actually, once I had tried the similar fix using the NMVE_TCP_Q_ALLOCATED
flag instead of nvme_TCP_Q_LIVE. In that case, the WARN had not been observed
during the 400 times repeat. Both showed good result, but as Hannes noted, I
agree that the check by NVME_TCP_Q_LIVE is the better because NVME_TCP_Q_LIVE
is cleared earlier than NVME_TCP_Q_ALLOCATED, and there will be less chance the
queue_lock gets destroyed before the mutex_lock().

Having said that, I still think there is a very tiny possibility that the
queue_lock gets destroyed between the NVME_TCP_Q_LIVE flag check and the
mutex_lock(). That is why I suggested the new flag NVME_TCP_Q_REFERRED.
Am I overthinking?