[PATCH 2/2] nvme-multipath: don't block on blk_queue_enter of the underlying device

Keith Busch kbusch at kernel.org
Mon Mar 22 15:31:01 GMT 2021


On Mon, Mar 22, 2021 at 08:37:26AM +0100, Christoph Hellwig wrote:
> When we reset/teardown a controller, we must freeze and quiesce the
> namespaces request queues to make sure that we safely stop inflight I/O
> submissions. Freeze is mandatory because if our hctx map changed between
> reconnects, blk_mq_update_nr_hw_queues will immediately attempt to freeze
> the queue, and if it still has pending submissions (that are still
> quiesced) it will hang.
> 
> However, by freezing the namespaces request queues, and only unfreezing
> them when we successfully reconnect, inflight submissions that are
> running concurrently can now block grabbing the nshead srcu until either
> we successfully reconnect or ctrl_loss_tmo expired (or the user
> explicitly disconnected).
> 
> This caused a deadlock when a different controller (different path on the
> same subsystem) became live (i.e. optimized/non-optimized). This is
> because nvme_mpath_set_live needs to synchronize the nshead srcu before
> requeueing I/O in order to make sure that current_path is visible to
> future (re-)submisions. However the srcu lock is taken by a blocked
> submission on a frozen request queue, and we have a deadlock.
> 
> In order to fix this use the blk_mq_submit_bio_direct API to submit the
> bio to the low-level driver, which does not block on the queue free
> but instead allows nvme-multipath to pick another path or queue up the
> bio.

Looks good.

Reviewed-by: Keith Busch <kbusch at kernel.org>



More information about the Linux-nvme mailing list