[PATCH v5 1/2] blk-mq: add tagset quiesce interface

Sagi Grimberg sagi at grimberg.me
Tue Jul 28 12:25:54 EDT 2020


>>>>>> I like the tagset based interface.  But the idea of doing a per-hctx
>>>>>> allocation and wait doesn't seem very scalable.
>>>>>>
>>>>>> Paul, do you have any good idea for an interface that waits on
>>>>>> multiple srcu heads?  As far as I can tell we could just have a single
>>>>>> global completion and counter, and each call_srcu would just
>>>>>> decrement it and then the final one would do the wakeup.  It would just
>>>>>> be great to figure out a way to keep the struct rcu_synchronize and
>>>>>> counter on stack to avoid an allocation.
>>>>>>
>>>>>> But if we can't do with an on-stack object I'd much rather just embed
>>>>>> the rcu_head in the hw_ctx.
>>>>>
>>>>> I think we can do that, please see the following patch which is against Sagi's V5:
>>>>
>>>> I don't think you can send a single rcu_head to multiple call_srcu calls.
>>>
>>> OK, then one variant is to put the rcu_head into blk_mq_hw_ctx, and put
>>> rcu_synchronize into blk_mq_tag_set.
>>
>> I can cook up a spin, but I still hate the fact that I have a queue that
>> ends up quiesced which I didn't want it to...
> 
> Why do we care so much about the connect_q?  Especially if we generalize
> it into a passthru queue that will absolutely need the quiesce hopefully
> soon.

The connect_q cannot be generalized to a passthru_q, exactly because of
the reason it exists in the first place. There is no way to guarantee
that the connect is issued before any pending request (in case of reset
during traffic).

We can use this API, but we will need to explicitly unquiesce the
connect_q afterwards, which is a bit ugly, something like:
--
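void nvme_stop_queues(struct nvme_ctrl *ctrl)
{
	/* quiesces every queue sharing the tagset, connect_q included */
	blk_mq_quiesce_tagset(ctrl->tagset);
	/* undo it for connect_q so a connect can still be issued */
	if (ctrl->connect_q)
		blk_mq_unquiesce_queue(ctrl->connect_q);
}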
EXPORT_SYMBOL_GPL(nvme_stop_queues);
--
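
To make the intended end state concrete, here is a rough userspace
model of the same sequence. It is only a sketch: struct toy_queue,
struct toy_tagset, struct toy_ctrl and the toy_* helpers are made-up
stand-ins, not kernel API, and it assumes quiesce is nothing more than
a per-queue flag. It says nothing about the SRCU waiting discussed
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, NOT kernel types: a queue only tracks a quiesced flag. */
struct toy_queue {
	bool quiesced;
};

struct toy_tagset {
	struct toy_queue **queues;	/* every queue sharing the tagset */
	size_t nr_queues;
};

struct toy_ctrl {
	struct toy_tagset *tagset;
	struct toy_queue *connect_q;	/* may be NULL */
};

/* Models the proposed blk_mq_quiesce_tagset(): hits every queue in the set. */
static void toy_quiesce_tagset(struct toy_tagset *set)
{
	for (size_t i = 0; i < set->nr_queues; i++)
		set->queues[i]->quiesced = true;
}

/* Models blk_mq_unquiesce_queue() for a single queue. */
static void toy_unquiesce_queue(struct toy_queue *q)
{
	q->quiesced = false;
}

/* Same shape as nvme_stop_queues() above, on the toy types. */
static void toy_stop_queues(struct toy_ctrl *ctrl)
{
	assert(ctrl->tagset != NULL);
	toy_quiesce_tagset(ctrl->tagset);
	if (ctrl->connect_q)
		toy_unquiesce_queue(ctrl->connect_q);
}
```

In this model, after toy_stop_queues() the I/O queues are quiesced and
the connect_q is not, which is the state we want. The connect_q still
gets quiesced for a short window in between, which is the part I don't
like.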


