lockdep warning: fs_reclaim_acquire vs tcp_sendpage

Sagi Grimberg sagi at grimberg.me
Thu Oct 20 09:20:13 PDT 2022


>> Just for the experiment, can you try with this change:
> 
> Good call, this seems to do the trick. The splat is gone with it.

OK, it doesn't say much, because that is just one of many conditions
that can make a socket release allocate an skb and send a TCP RST,
which can happen under memory pressure.
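
For context, one such path, sketched from memory (assuming an abortive
close of a socket that still has unread data; not a verbatim trace of
the kernel source):

  sock_release(queue->sock)
    inet_release()
      tcp_close()
        tcp_send_active_reset()  /* RST for the abortive close */
          alloc_skb()            /* may enter direct reclaim, i.e. fs_reclaim,
                                    while the socket lock is held */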

Setting a minimum linger of 1 is also not a great option: it means
that if the controller is not accessible, we can block for 1 second
per queue, which is awful.
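
To make the tradeoff concrete, the rejected alternative would be
something along these lines (a sketch only; nvme_tcp_set_min_linger is
a made-up name, and I'm assuming the experiment simply enabled
SO_LINGER with a 1 second timeout, mirroring sock_no_linger() but with
a non-zero lingertime):
--
#include <net/sock.h>

/*
 * Hypothetical: let sock_release()/tcp_close() wait up to 1 second
 * instead of sending an immediate RST on close.
 */
static void nvme_tcp_set_min_linger(struct sock *sk)
{
	lock_sock(sk);
	sk->sk_lingertime = 1 * HZ;	/* 1 second, in jiffies */
	sock_set_flag(sk, SOCK_LINGER);
	release_sock(sk);
}
--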

Does this change also make the issue go away?
--
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index c5bea92560bd..5bae8914c861 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1300,6 +1300,7 @@ static void nvme_tcp_free_queue(struct nvme_ctrl *nctrl, int qid)
         struct page *page;
         struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl);
         struct nvme_tcp_queue *queue = &ctrl->queues[qid];
+       unsigned int noreclaim_flag;

         if (!test_and_clear_bit(NVME_TCP_Q_ALLOCATED, &queue->flags))
                 return;
@@ -1312,7 +1313,11 @@ static void nvme_tcp_free_queue(struct nvme_ctrl *nctrl, int qid)
                 __page_frag_cache_drain(page, queue->pf_cache.pagecnt_bias);
                 queue->pf_cache.va = NULL;
         }
+
+       noreclaim_flag = memalloc_noreclaim_save();
         sock_release(queue->sock);
+       memalloc_noreclaim_restore(noreclaim_flag);
+
         kfree(queue->pdu);
         mutex_destroy(&queue->send_mutex);
         mutex_destroy(&queue->queue_lock);
--
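
The pattern itself is not nvme specific; as a standalone sketch (helper
name made up, and assuming my reading of the report is right that the
problematic edge is sk_lock -> fs_reclaim on the close path), the idea
is:
--
#include <linux/net.h>		/* struct socket, sock_release() */
#include <linux/sched/mm.h>	/* memalloc_noreclaim_save/restore() */

/*
 * Hypothetical helper: PF_MEMALLOC is set for the duration of
 * sock_release(), so an skb allocated on the close path (e.g. for the
 * TCP RST) will not enter direct reclaim while the socket lock is held,
 * which is the dependency that conflicts with reclaim writing back
 * through tcp_sendpage() under fs_reclaim.
 */
static void sock_release_noreclaim(struct socket *sock)
{
	unsigned int noreclaim_flag;

	noreclaim_flag = memalloc_noreclaim_save();
	sock_release(sock);
	memalloc_noreclaim_restore(noreclaim_flag);
}
--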


