[PATCH V3] nvme-tcp: teardown circular locking fixes
Chaitanya Kulkarni
chaitanyak at nvidia.com
Mon Mar 9 22:45:24 PDT 2026
Sagi,
On 2/27/26 15:12, Chaitanya Kulkarni wrote:
> On 2/25/26 18:56, Chaitanya Kulkarni wrote:
>> When a controller reset is triggered via sysfs (by writing to
>> /sys/class/nvme/<nvmedev>/reset_controller), the reset work tears down
>> and re-establishes all queues. The socket release using fput() defers
>> the actual cleanup to task_work delayed_fput workqueue. This deferred
>> cleanup can race with the subsequent queue re-allocation during reset,
>> potentially leading to use-after-free or resource conflicts.
>>
>> Replace fput() with __fput_sync() to ensure synchronous socket release,
>> guaranteeing that all socket resources are fully cleaned up before the
>> function returns. This prevents races during controller reset where
>> new queue setup may begin before the old socket is fully released.
>>
>> * Call chain during reset:
>> nvme_reset_ctrl_work()
>> -> nvme_tcp_teardown_ctrl()
>> -> nvme_tcp_teardown_io_queues()
>> -> nvme_tcp_free_io_queues()
>> -> nvme_tcp_free_queue() <-- fput() -> __fput_sync()
>> -> nvme_tcp_teardown_admin_queue()
>> -> nvme_tcp_free_admin_queue()
>> -> nvme_tcp_free_queue() <-- fput() -> __fput_sync()
>> -> nvme_tcp_setup_ctrl() <-- race with deferred fput
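
To make the race in the call chain above concrete, here is a rough
user-space toy model (toy_* names are illustrative, not kernel APIs):
fput() queues the real release to run later, so a subsequent setup step
can observe the old socket still alive, while __fput_sync() releases it
before returning.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: socket_released stands in for the socket's real
 * cleanup having run; deferred_work stands in for task_work. */
static bool socket_released;
static void (*deferred_work)(void);

static void release_socket(void)
{
	socket_released = true;
}

/* models fput(): queue the actual release to run later */
static void toy_fput(void)
{
	deferred_work = release_socket;
}

/* models __fput_sync(): release before returning to the caller */
static void toy_fput_sync(void)
{
	release_socket();
}

/* models task_work running the deferred release at some later point */
static void toy_run_deferred(void)
{
	if (deferred_work) {
		deferred_work();
		deferred_work = NULL;
	}
}

/* models nvme_tcp_setup_ctrl(): safe only once the old socket is gone */
static bool toy_setup_is_safe(void)
{
	return socket_released;
}
```

With toy_fput(), toy_setup_is_safe() is false until toy_run_deferred()
happens to run; with toy_fput_sync(), it is true immediately, which is
the guarantee the patch wants during reset.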
>>
>> memalloc_noreclaim_save() sets PF_MEMALLOC which is intended for tasks
>> performing memory reclaim work that need reserve access. While PF_MEMALLOC
>> prevents the task from entering direct reclaim (causing __need_reclaim() to
>> return false), it does not strip __GFP_IO from gfp flags. The allocator can
>> therefore still trigger writeback I/O when __GFP_IO remains set, which is
>> unsafe when the caller holds block layer locks.
>>
>> Switch to memalloc_noio_save() which sets PF_MEMALLOC_NOIO. This causes
>> current_gfp_context() to strip __GFP_IO|__GFP_FS from every allocation in
>> the scope, making it safe to allocate memory while holding elevator_lock and
>> set->srcu.
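
The PF_MEMALLOC vs PF_MEMALLOC_NOIO distinction can be sketched with a
simplified model of current_gfp_context() (flag values here are
illustrative bit positions, not the kernel's real ones): only
PF_MEMALLOC_NOIO masks __GFP_IO|__GFP_FS out of the allocation flags.

```c
/* Illustrative flag values; the real kernel bits differ. */
#define __GFP_IO          (1u << 0)
#define __GFP_FS          (1u << 1)
#define PF_MEMALLOC       (1u << 2)
#define PF_MEMALLOC_NOIO  (1u << 3)

/* Simplified model of current_gfp_context(): PF_MEMALLOC_NOIO strips
 * __GFP_IO|__GFP_FS from every allocation in the scope, whereas
 * PF_MEMALLOC leaves the gfp flags untouched (it only affects the
 * reclaim path itself). */
static unsigned int toy_current_gfp_context(unsigned int pflags,
					    unsigned int gfp)
{
	if (pflags & PF_MEMALLOC_NOIO)
		gfp &= ~(__GFP_IO | __GFP_FS);
	return gfp;
}
```

So under memalloc_noreclaim_save() an allocation keeps __GFP_IO and can
still kick off writeback I/O, while under memalloc_noio_save() the same
allocation has __GFP_IO|__GFP_FS cleared, which is the property the
patch relies on while holding elevator_lock.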
>>
> Can you please take a look when you get a chance? See also [1].
>
> -ck
>
> [1]
> https://lore.kernel.org/all/20251125005950.41046-1-ckulkarnilinux@gmail.com/
>
>
Can you please take a look at this?
-ck
More information about the Linux-nvme mailing list