[PATCH v2 1/2] blk-mq: add tagset quiesce interface
Chao Leng
lengchao at huawei.com
Tue Oct 18 19:39:41 PDT 2022
On 2022/10/18 23:04, Paul E. McKenney wrote:
> On Tue, Oct 18, 2022 at 05:52:06PM +0800, Chao Leng wrote:
>> On 2022/10/17 23:21, Paul E. McKenney wrote:
>>> On Mon, Oct 17, 2022 at 03:39:06PM +0200, Christoph Hellwig wrote:
>>>> On Thu, Oct 13, 2022 at 05:44:49PM +0800, Chao Leng wrote:
>>>>> +	rcu = kvmalloc(count * sizeof(*rcu), GFP_KERNEL);
>>>>> +	if (rcu) {
>>>>> +		list_for_each_entry(q, &set->tag_list, tag_set_list) {
>>>>> +			if (blk_queue_noquiesced(q))
>>>>> +				continue;
>>>>> +
>>>>> +			init_rcu_head(&rcu[i].head);
>>>>> +			init_completion(&rcu[i].completion);
>>>>> +			call_srcu(q->srcu, &rcu[i].head, wakeme_after_rcu);
>>>>> +			i++;
>>>>> +		}
>>>>> +
>>>>> +		for (i = 0; i < count; i++) {
>>>>> +			wait_for_completion(&rcu[i].completion);
>>>>> +			destroy_rcu_head(&rcu[i].head);
>>>>> +		}
>>>>> +		kvfree(rcu);
>>>>> +	} else {
>>>>> +		list_for_each_entry(q, &set->tag_list, tag_set_list)
>>>>> +			synchronize_srcu(q->srcu);
>>>>> +	}
>>>>
>>>> Having to allocate a struct rcu_synchronize for each of the potentially
>>>> many queues here is a bit sad.
>>>>
>>>> Paul just explained the start_poll_synchronize_rcu interfaces at ALPSS
>>>> last week, so I wonder if something like that would also be feasible
>>>> for SRCU, as that would come in really handy here.
>>>
>>> There is start_poll_synchronize_srcu() and poll_state_synchronize_srcu(),
>>> but there would need to be an unsigned long for each srcu_struct from
>>> which an SRCU grace period was required. This would be half the size
>>> of the "rcu" array above, but still maybe larger than you would like.
>>>
>>> The resulting code might look something like this, with "rcu" now being
>>> a pointer to unsigned long:
>>>
>>> 	rcu = kvmalloc(count * sizeof(*rcu), GFP_KERNEL);
>>> 	if (rcu) {
>>> 		list_for_each_entry(q, &set->tag_list, tag_set_list) {
>>> 			if (blk_queue_noquiesced(q))
>>> 				continue;
>>> 			rcu[i] = start_poll_synchronize_srcu(q->srcu);
>>> 			i++;
>>> 		}
>>>
>>> 		i = 0;
>>> 		list_for_each_entry(q, &set->tag_list, tag_set_list) {
>>> 			if (blk_queue_noquiesced(q))
>>> 				continue;
>>> 			if (!poll_state_synchronize_srcu(q->srcu, rcu[i]))
>>> 				synchronize_srcu(q->srcu);
>>> 			i++;
>>> 		}
>> synchronize_srcu() will start a new grace period rather than just
>> waiting for the one already started.
>
> True, but SRCU grace periods normally complete reasonably quickly, so
> the synchronize_srcu() might well be faster than the loop, depending on
> what the corresponding SRCU readers are doing.
Yes. Different workloads lead to different wait times, so it is hard to
say which approach is better or worse.
>
>> Maybe it would be better like this:
>> while (!poll_state_synchronize_srcu(q->srcu, rcu[i]))
>> 	schedule_timeout_uninterruptible(1);
>
> Why not try it both ways and see what happens? Assuming that is, that
> the differences matter in this code.
>
> Thanx, Paul
>
>>> 		kvfree(rcu);
>>> 	} else {
>>> 		list_for_each_entry(q, &set->tag_list, tag_set_list)
>>> 			synchronize_srcu(q->srcu);
>>> 	}
>>>
>>> Or as Christoph suggested, just have a single srcu_struct for the
>>> whole group.
>>>
>>> The main reason for having multiple srcu_struct structures is to
>>> prevent the readers from one from holding up the updaters from another.
>>> Except that by waiting for the multiple grace periods, you are losing
>>> that property anyway, correct? Or is this code waiting on only a small
>>> fraction of the srcu_struct structures associated with blk_queue?
>>>
>>> Thanx, Paul
More information about the Linux-nvme mailing list