[PATCH V2] nvmet: move async event work off nvmet-wq

Chaitanya Kulkarni chaitanyak at nvidia.com
Sun Apr 12 17:11:03 PDT 2026


Keith,

On 3/9/26 22:44, Chaitanya Kulkarni wrote:
> On 2/25/26 20:30, Chaitanya Kulkarni wrote:
>> On the target side, nvmet_ctrl_free() flushes ctrl->async_event_work.
>> If nvmet_ctrl_free() itself runs on nvmet-wq, the flush re-acquires
>> the lockdep completion map the running nvmet-wq worker already holds:
>>
>> A. Async event work queued on nvmet-wq (prior to disconnect):
>>    nvmet_execute_async_event()
>>       queue_work(nvmet_wq, &ctrl->async_event_work)
>>
>>    nvmet_add_async_event()
>>       queue_work(nvmet_wq, &ctrl->async_event_work)
>>
>> B. Full pre-work chain (RDMA CM path):
>>    nvmet_rdma_cm_handler()
>>       nvmet_rdma_queue_disconnect()
>>         __nvmet_rdma_queue_disconnect()
>>           queue_work(nvmet_wq, &queue->release_work)
>>             process_one_work()
>>               lock((wq_completion)nvmet-wq)  <--------- 1st
>>               nvmet_rdma_release_queue_work()
>>
>> C. Recursive path (same worker):
>>    nvmet_rdma_release_queue_work()
>>       nvmet_rdma_free_queue()
>>         nvmet_sq_destroy()
>>           nvmet_ctrl_put()
>>             nvmet_ctrl_free()
>>               flush_work(&ctrl->async_event_work)
>>                 __flush_work()
>>                   touch_wq_lockdep_map()
>>                   lock((wq_completion)nvmet-wq) <--------- 2nd
>>
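>> To make the recursion concrete, here is a toy kernel-module sketch
>> (hypothetical, not nvmet code) of the pattern lockdep flags: a work
>> item running on a WQ_MEM_RECLAIM workqueue flushes another work item
>> queued on that same workqueue, so __flush_work() touches the
>> (wq_completion) lockdep map the running worker already holds:
>>
>>    #include <linux/module.h>
>>    #include <linux/workqueue.h>
>>
>>    static struct workqueue_struct *toy_wq;
>>    static struct work_struct aen_work;     /* plays async_event_work */
>>    static struct work_struct release_work; /* plays queue->release_work */
>>
>>    static void aen_fn(struct work_struct *w) { }
>>
>>    static void release_fn(struct work_struct *w)
>>    {
>>            /* ensure the flushed work is pending on the same wq */
>>            queue_work(toy_wq, &aen_work);
>>            /*
>>             * We are inside process_one_work() for toy_wq, so lockdep
>>             * already holds (wq_completion)toy-wq; flushing a work on
>>             * the same wq re-acquires that map -> recursion report.
>>             */
>>            flush_work(&aen_work);
>>    }
>>
>>    static int __init toy_init(void)
>>    {
>>            /* WQ_MEM_RECLAIM (as nvmet-wq uses) gives the wq a
>>             * rescuer, which makes flush take the wq-level map */
>>            toy_wq = alloc_workqueue("toy-wq", WQ_MEM_RECLAIM, 0);
>>            if (!toy_wq)
>>                    return -ENOMEM;
>>            INIT_WORK(&aen_work, aen_fn);
>>            INIT_WORK(&release_work, release_fn);
>>            queue_work(toy_wq, &release_work);
>>            return 0;
>>    }
>>
>>    static void __exit toy_exit(void)
>>    {
>>            destroy_workqueue(toy_wq);
>>    }
>>
>>    module_init(toy_init);
>>    module_exit(toy_exit);
>>    MODULE_LICENSE("GPL");
>>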
>> Lockdep splat:
>>
>>    ============================================
>>    WARNING: possible recursive locking detected
>>    6.19.0-rc3nvme+ #14 Tainted: G                 N
>>    --------------------------------------------
>>    kworker/u192:42/44933 is trying to acquire lock:
>>    ffff888118a00948 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: touch_wq_lockdep_map+0x26/0x90
>>
>>    but task is already holding lock:
>>    ffff888118a00948 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: process_one_work+0x53e/0x660
>>
>>    3 locks held by kworker/u192:42/44933:
>>     #0: ffff888118a00948 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: process_one_work+0x53e/0x660
>>     #1: ffffc9000e6cbe28 ((work_completion)(&queue->release_work)){+.+.}-{0:0}, at: process_one_work+0x1c5/0x660
>>     #2: ffffffff82d4db60 (rcu_read_lock){....}-{1:3}, at: __flush_work+0x62/0x530
>>
>>    Workqueue: nvmet-wq nvmet_rdma_release_queue_work [nvmet_rdma]
>>    Call Trace:
>>     __flush_work+0x268/0x530
>>     nvmet_ctrl_free+0x140/0x310 [nvmet]
>>     nvmet_cq_put+0x74/0x90 [nvmet]
>>     nvmet_rdma_free_queue+0x23/0xe0 [nvmet_rdma]
>>     nvmet_rdma_release_queue_work+0x19/0x50 [nvmet_rdma]
>>     process_one_work+0x206/0x660
>>     worker_thread+0x184/0x320
>>     kthread+0x10c/0x240
>>     ret_from_fork+0x319/0x390
>>
>> Move the async event work to a dedicated nvmet-aen-wq to avoid the
>> reentrant flush on nvmet-wq.
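>>
>> A minimal sketch of the shape of the fix (the exact hunks, flag
>> choice, and init placement below are assumptions, not the patch
>> itself; nvmet_aen_wq is the name implied by "nvmet-aen-wq"):
>>
>>    static struct workqueue_struct *nvmet_aen_wq;
>>
>>    /* nvmet_init(): allocate alongside the existing nvmet_wq */
>>    nvmet_aen_wq = alloc_workqueue("nvmet-aen-wq",
>>                                   WQ_MEM_RECLAIM | WQ_UNBOUND, 0);
>>    if (!nvmet_aen_wq)
>>            return -ENOMEM;
>>
>>    /*
>>     * nvmet_execute_async_event() / nvmet_add_async_event(): queue
>>     * the AEN work on the dedicated wq instead of nvmet_wq, so the
>>     * flush_work() in nvmet_ctrl_free() targets a workqueue the
>>     * caller can never be running on.
>>     */
>>    queue_work(nvmet_aen_wq, &ctrl->async_event_work);
>>
>> With that split, flush_work(&ctrl->async_event_work) from an
>> nvmet-wq worker touches the nvmet-aen-wq lockdep map rather than
>> the (wq_completion)nvmet-wq map the worker already holds.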
>>
>> Signed-off-by: Chaitanya Kulkarni <kch at nvidia.com>
>> ---
>
>
> Can we please merge this?
>
> -ck
>
>
It looks like this patch has not been merged yet; can you please pick
it up? It has Christoph's Reviewed-by:

https://lists.infradead.org/pipermail/linux-nvme/2026-February/061381.html

-ck



