[PATCH 2/2] nvme-multipath: don't block on blk_queue_enter of the underlying device

Christoph Hellwig hch at lst.de
Tue Mar 23 16:15:44 GMT 2021


On Tue, Mar 23, 2021 at 12:36:40AM -0700, Sagi Grimberg wrote:
>> The process:
>> 1. nvme_ns_head_submit_bio calls srcu_read_lock(&head->srcu).
>> 2. nvme_ns_head_submit_bio adds the bio to current->bio_list instead of
>> waiting for the frozen queue.
>
> Nothing guarantees that you have a bio_list active at any point in time,
> in fact for a workload that submits one by one you will always drain
> that list directly in the submission...

It should always be active when ->submit_bio is called.

>
>> 3. nvme_ns_head_submit_bio calls srcu_read_unlock(&head->srcu, srcu_idx).
>> So nvme_ns_head_submit_bio does not hold head->srcu for long while the
>> queue is frozen, which avoids the deadlock.
>>
>> Sagi, suggest trying this patch.
>
> The above reproduces with the patch applied on upstream nvme code.

Weird.  I don't think the deadlock in your original report should
happen due to this.  Can you take a look at the callstacks in the
reproduced deadlock?  Either we're missing something obvious or it is a
somewhat different deadlock.


