[PATCHv2] nvme: use srcu for iterating namespace list

Sagi Grimberg sagi at grimberg.me
Fri May 24 08:29:10 PDT 2024



On 24/05/2024 7:41, Shinichiro Kawasaki wrote:
> On May 23, 2024 / 10:20, Keith Busch wrote:
>> From: Keith Busch <kbusch at kernel.org>
>>
>> The nvme pci driver synchronizes with all the namespace queues during a
>> reset to ensure that there's no pending timeout work.
>>
>> Meanwhile the timeout work potentially iterates those same namespaces to
>> freeze their queues.
>>
>> Each of those namespace iterations uses the same read lock. If a write
>> lock should somehow get between the synchronize and freeze steps, then
>> forward progress is deadlocked.
>>
>> We had been relying on the nvme controller state machine to ensure the
>> reset work wouldn't conflict with timeout work. That guarantee may be a
>> bit fragile to rely on, so iterate the namespace lists without taking a
>> lock to fix potential circular locks, as reported by lockdep.
>>
>> Link: https://lore.kernel.org/all/20220930001943.zdbvolc3gkekfmcv@shindev/
>> Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki at wdc.com>
>> Signed-off-by: Keith Busch <kbusch at kernel.org>
> Keith, thank you very much for the patch. Sagi, Christoph, thank you for the
> discussion.
>
> I confirmed this patch avoids the lockdep WARN that I reported. I also ran my
> test set with the patch and observed no regression. Looks good.
>
> Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki at wdc.com>

I'm assuming that all of blktests ran with this change, including fabrics
and permutations?


