[PATCH v2] nvme: tcp: avoid race between queue_lock lock and destroy

Sagi Grimberg sagi at grimberg.me
Sun Oct 20 17:11:53 PDT 2024




On 03/10/2024 23:13, Keith Busch wrote:
> On Wed, Oct 02, 2024 at 01:51:41PM +0900, Shin'ichiro Kawasaki wrote:
>> From: Hannes Reinecke <hare at suse.de>
>>
>> Commit 76d54bf20cdc ("nvme-tcp: don't access released socket during
>> error recovery") added a mutex_lock() call for queue->queue_lock in
>> nvme_tcp_get_address(). However, that mutex_lock() races with the
>> mutex_destroy() call in nvme_tcp_free_queue(), triggering the WARN
>> below.
> <snip>
>
>> The WARN is observed when the blktests test case nvme/014 is repeated
>> with the tcp transport. It is rare; in some test environments around
>> 200 repetitions are needed to reproduce it.
>>
>> To avoid the WARN, check the NVME_TCP_Q_LIVE flag before locking
>> queue->queue_lock. The flag is cleared long before the lock is
>> destroyed.
> I've applied this to nvme-6.12, but the existence of this queue_lock
> seems strange. It looks like tcp is relying on blk-mq's timeout to
> individually complete requests after the queue is stopped, but I feel
> like there should be a way to complete everything in a single batch. We
> have the generic nvme_cancel_request() for this reason, but fabrics has
> its own way of doing it one at a time?
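For reference, the flag check the patch describes looks roughly like
this (a minimal sketch of the pattern, not the exact upstream diff; the
identifiers are the ones used in drivers/nvme/host/tcp.c):

static int nvme_tcp_get_address(struct nvme_ctrl *ctrl, char *buf, int size)
{
	struct nvme_tcp_queue *queue = &to_tcp_ctrl(ctrl)->queues[0];
	struct sockaddr_storage src_addr;
	int ret, len;

	len = nvmf_get_address(ctrl, buf, size);

	/*
	 * Bail out before taking queue_lock: once NVME_TCP_Q_LIVE is
	 * cleared, the teardown path may mutex_destroy() the lock.
	 */
	if (!test_bit(NVME_TCP_Q_LIVE, &queue->flags))
		return len;

	mutex_lock(&queue->queue_lock);
	/* Re-check under the lock in case teardown raced with us. */
	if (test_bit(NVME_TCP_Q_LIVE, &queue->flags)) {
		ret = kernel_getsockname(queue->sock,
					 (struct sockaddr *)&src_addr);
		if (ret > 0)
			len += scnprintf(buf + len, size - len,
					 "src_addr=%pISc\n", &src_addr);
	}
	mutex_unlock(&queue->queue_lock);

	return len;
}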

It has traces somewhere in git history... the original introduction of
this lock was to make the timeout handler and the queue stop mutually
exclusive...

It can probably be reworked, but the dereference here needs to
serialize with the stop path that releases the socket...
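For comparison, the batch completion Keith mentions would look roughly
like the pattern below (a sketch only; example_teardown_io is a
hypothetical name, while nvme_quiesce_io_queues(), nvme_cancel_request()
and the blk_mq_tagset helpers are the existing interfaces other
transports use after quiescing the queues):

static void example_teardown_io(struct nvme_ctrl *ctrl)
{
	/* Stop dispatching new requests to the hardware queues. */
	nvme_quiesce_io_queues(ctrl);

	/*
	 * Walk every busy request in the tagset and complete it with a
	 * host path error in one pass, instead of letting the blk-mq
	 * timeout handler reap them one at a time.
	 */
	blk_mq_tagset_busy_iter(ctrl->tagset, nvme_cancel_request, ctrl);
	blk_mq_tagset_wait_completed_request(ctrl->tagset);
}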


