[PATCH] nvmet-rdma: Suppress a class of lockdep complaints

Sagi Grimberg sagi at grimberg.me
Tue May 9 16:24:17 PDT 2023


>> Bart, thank you very much for this immediate action after the
>> discussion at LSF.
>> This is encouraging. I applied the patch on top of v6.4-rc1 and ran
>> the test case with various transports. Unfortunately, I observed
>> kernel panics with the rdma and siw transports [1][2]. I also
>> observed another lockdep WARN with the tcp transport [3]. It looks
>> like your fix unveiled more hidden issues.
> 
> Please use siw instead of rxe when running blktests - there are known 
> issues with the rxe driver.
> 
> Please apply these patches on top of kernel v6.3 instead of v6.4-rc1. 
> The hrtimer_interrupt() crash shown below is a v6.4-rc1 regression and 
> does not occur with the v6.3 kernel.
> 
> Since my patch is for the RDMA transport, it is not clear to me why a 
> report for the TCP transport is included in a reply to my patch?

Agreed.

The same fundamental issue exists in nvmet-tcp as well.

Applying the same change to nvmet-tcp (compile-tested only) gives:
--
diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index ed98df72c76b..a4367fa6f55b 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -163,7 +163,9 @@ struct nvmet_tcp_queue {
         struct sockaddr_storage sockaddr;
         struct sockaddr_storage sockaddr_peer;
         struct work_struct      release_work;
-
+#ifdef CONFIG_LOCKDEP
+       struct lock_class_key   key;
+#endif
         int                     idx;
         struct list_head        queue_list;

@@ -1472,6 +1474,10 @@ static void nvmet_tcp_release_queue_work(struct work_struct *w)
         list_del_init(&queue->queue_list);
         mutex_unlock(&nvmet_tcp_queue_mutex);

+#ifdef CONFIG_LOCKDEP
+       lockdep_unregister_key(&queue->key);
+#endif
+
         nvmet_tcp_restore_socket_callbacks(queue);
         cancel_work_sync(&queue->io_work);
         /* stop accepting incoming data */
@@ -1654,6 +1660,11 @@ static int nvmet_tcp_alloc_queue(struct nvmet_tcp_port *port,
         if (ret)
                 goto out_destroy_sq;

+#ifdef CONFIG_LOCKDEP
+       lockdep_register_key(&queue->key);
+       lockdep_init_map(&queue->release_work.lockdep_map,
+                        "nvmet_tcp_release_work", &queue->key, 0);
+#endif
         return 0;
  out_destroy_sq:
         mutex_lock(&nvmet_tcp_queue_mutex);
--
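
For readers less familiar with dynamic lockdep keys, here is a userspace
toy model of what the diff above changes. It is a sketch, not kernel code,
and every toy_* name in it is hypothetical. It only mimics the fact that
lockdep groups lock maps into classes by key: with one shared key every
queue's release_work falls into the same class, whereas a key embedded in
each queue (as the patch does with queue->key) gives each queue its own
class. That one queue waiting on another queue's release_work is what
triggers the complaint is an assumption here, not something stated in this
thread.

```c
/*
 * Userspace toy model, NOT kernel code. It imitates only one aspect of
 * lockdep: two lock maps belong to the same class when they refer to the
 * same key. All toy_* names are hypothetical.
 */
#include <assert.h>
#include <stddef.h>

struct toy_key { char unused; };               /* stand-in for struct lock_class_key */
struct toy_map { const struct toy_key *key; }; /* stand-in for struct lockdep_map */

struct toy_queue {
	struct toy_map release_work_map; /* models release_work.lockdep_map */
	struct toy_key key;              /* per-queue key, as added by the patch */
};

/* A single key shared by all queues: models one static initialization site. */
static struct toy_key toy_shared_key;

/* Before the patch (modelled): every queue points at the shared key. */
static void toy_init_shared(struct toy_queue *q)
{
	q->release_work_map.key = &toy_shared_key;
}

/* After the patch (modelled): every queue points at its own embedded key. */
static void toy_init_per_queue(struct toy_queue *q)
{
	q->release_work_map.key = &q->key;
}

/*
 * Returns 1 if both queues' release_work maps are in the same class.
 * A validator that reasons per class cannot tell such queues apart.
 */
static int toy_same_class(const struct toy_queue *a, const struct toy_queue *b)
{
	assert(a->release_work_map.key != NULL);
	assert(b->release_work_map.key != NULL);
	return a->release_work_map.key == b->release_work_map.key;
}
```

With toy_init_shared() two distinct queues compare as the same class, and
with toy_init_per_queue() they do not. In the real patch the per-queue key
additionally has to be registered with lockdep_register_key() before use
and unregistered with lockdep_unregister_key() when the queue goes away.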


