[PATCH V8 0/4] blk-mq: implement queue quiesce via percpu_ref for BLK_MQ_F_BLOCKING

Chao Leng lengchao at huawei.com
Thu Feb 25 22:51:17 EST 2021


nvme_stop_queues() takes a long time when there are a large number of
namespaces. With multipath, if one path fails, failover and retry are
delayed for a long time, and the more namespaces there are, the longer
the delay. This has a great impact on delay-sensitive services.
There are two options to fix it:
1. Use percpu_ref instead of SRCU. Ming's patchset.
2. Use the tagset quiesce interface with SRCU. Sagi's patchset.
Both patchsets are still pending.
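
To make the scaling concrete, here is a rough userspace cost model, not
kernel code. It assumes that quiescing one queue costs one grace-period
wait, and that a tagset-wide quiesce can flag every queue first and then
wait once. The names model_queue, serial_quiesce_waits and
batched_quiesce_waits are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a request queue; not a kernel structure. */
struct model_queue {
	int quiesced;
};

/*
 * Per-queue quiesce: flag one queue, then wait for a grace period before
 * moving on to the next one. Each "waits++" stands in for a blocking
 * grace-period wait (an assumption of this model).
 */
static unsigned int serial_quiesce_waits(struct model_queue *q, size_t n)
{
	unsigned int waits = 0;
	size_t i;

	assert(q != NULL || n == 0);
	for (i = 0; i < n; i++) {
		q[i].quiesced = 1;
		waits++;	/* one grace period per namespace queue */
	}
	return waits;
}

/*
 * Tagset-wide quiesce: flag every queue first, then wait once for all of
 * them. Whether the single wait is an SRCU grace period or a percpu_ref
 * drain is outside the scope of this model.
 */
static unsigned int batched_quiesce_waits(struct model_queue *q, size_t n)
{
	size_t i;

	assert(q != NULL || n == 0);
	for (i = 0; i < n; i++)
		q[i].quiesced = 1;
	return n ? 1 : 0;	/* a single shared wait covers all queues */
}
```

Under these assumptions the per-queue path grows linearly with the
number of namespaces, while the batched path stays constant.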

This is a serious bug, and I hope we can revisit the solution.
We may not have a perfect option, but we need to choose a relatively
acceptable one.

Can we fix the bug for non-blocking queues (which are used by fc and rdma) first?

Sagi & Ming, what do you think?
Thank you.

On 2020/10/20 16:55, Ming Lei wrote:
> Hi Jens,
> 
> The 1st patch adds .mq_quiesce_mutex for serializing quiesce/unquiesce,
> and prepares for replacing srcu with percpu_ref.
> 
> The 2nd patch replaces srcu with percpu_ref.
> 
> The 3rd patch adds tagset quiesce interface.
> 
> The 4th patch applies tagset quiesce interface for NVMe subsystem.
> 
> V8:
> 	- rebase on latest Linus tree; there is only a small fuzz change on 2/4
> 
> V7:
> 	- base on latest for-5.10/block; there is only a small change on 2/4
> 
> V6:
> 	- base on for-5.10/block directly, instead of on top of the patchset
> 	'percpu_ref & block: reduce memory footprint of percpu_ref in fast path',
> 	because these patches don't depend on that patchset.
> 
> V5:
> 	- warn once if a driver unquiesces a queue whose quiesce is
> 	  still in progress; only patch 2 is modified
> 
> V4:
> 	- remove .mq_quiesce_mutex, and switch to test_and_[set|clear] for
> 	avoiding duplicated quiesce action
> 	- pass blktests(block, nvme)
> 
> V3:
> 	- add tagset quiesce interface
> 	- apply tagset quiesce interface for NVMe
> 	- pass blktests(block, nvme)
> 
> V2:
> 	- add .mq_quiesce_lock
> 	- add comment on patch 2 wrt. handling hctx_lock() failure
> 	- trivial patch style change
> 
> 
> Ming Lei (3):
>    block: use test_and_{clear|test}_bit to set/clear QUEUE_FLAG_QUIESCED
>    blk-mq: implement queue quiesce via percpu_ref for BLK_MQ_F_BLOCKING
>    blk-mq: add tagset quiesce interface
> 
> Sagi Grimberg (1):
>    nvme: use blk_mq_[un]quiesce_tagset
> 
>   block/blk-core.c         |  13 +++
>   block/blk-mq-sysfs.c     |   2 -
>   block/blk-mq.c           | 182 +++++++++++++++++++++++++--------------
>   block/blk-sysfs.c        |   6 +-
>   block/blk.h              |   2 +
>   drivers/nvme/host/core.c |  19 ++--
>   include/linux/blk-mq.h   |  10 +--
>   include/linux/blkdev.h   |   4 +
>   8 files changed, 154 insertions(+), 84 deletions(-)
> 
> Cc: Hannes Reinecke <hare at suse.de>
> Cc: Sagi Grimberg <sagi at grimberg.me>
> Cc: Bart Van Assche <bvanassche at acm.org>
> Cc: Johannes Thumshirn <Johannes.Thumshirn at wdc.com>
> Cc: Chao Leng <lengchao at huawei.com>
> 


