[PATCH 2/2] nvme-multipath: don't block on blk_queue_enter of the underlying device

Chao Leng lengchao at huawei.com
Tue Mar 23 08:13:09 GMT 2021



On 2021/3/23 15:36, Sagi Grimberg wrote:
> 
>> I check it again. I still think the below patch can avoid the bug.
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5a6c35f9af416114588298aa7a90b15bbed15a41 
> 
> I don't understand what you are saying...
> 
>>
>> The process:
>> 1.nvme_ns_head_submit_bio call srcu_read_lock(&head->srcu).
>> 2.nvme_ns_head_submit_bio will add the bio to current->bio_list instead of waiting for the frozen queue.
> 
> Nothing guarantees that you have a bio_list active at any point in time,
> in fact for a workload that submits one by one you will always drain
> that list directly in the submission...
submit_bio and nvme_requeue_work both guarantee current->bio_list.
The process:
1.submit_bio and  nvme_requeue_work will call submit_bio_noacct.
2.submit_bio_noacct will call __submit_bio_noacct because bio->bi_disk->fops->submit_bio = nvme_ns_head_submit_bio.
3.__submit_bio_noacct set current->bio_list, and then __submit_bio will call bio->bi_disk->fops->submit_bio(nvme_ns_head_submit_bio)
4.nvme_ns_head_submit_bio will add the bio to current->bio_list.
5.__submit_bio_noacct drain current->bio_list.
when drain current->bio_list, it will wait for the frozen queue but do not hold the head->srcu.
Because it call blk_mq_submit_bio directly instead of ->submit_bio(nvme_ns_head_submit_bio).
So it is safe.
> 
>> 3.nvme_ns_head_submit_bio call srcu_read_unlock(&head->srcu, srcu_idx).
>> So nvme_ns_head_submit_bio do not hold head->srcu long when the queue is frozen, can avoid deadlock.
>>
>> Sagi, suggest trying this patch.
> 
> The above reproduces with the patch applied on upstream nvme code.The new patch(blk_mq_submit_bio_direct) will cause the bug again.
Because it revert add the bio to current->bio_list.
Just try the upstream nvme code, and do not apply the new patch(blk_mq_submit_bio_direct).
> .



More information about the Linux-nvme mailing list