[PATCH v2 1/2] blk-mq: add tagset quiesce interface
Chao Leng
lengchao at huawei.com
Tue Oct 18 02:52:06 PDT 2022
On 2022/10/17 23:21, Paul E. McKenney wrote:
> On Mon, Oct 17, 2022 at 03:39:06PM +0200, Christoph Hellwig wrote:
>> On Thu, Oct 13, 2022 at 05:44:49PM +0800, Chao Leng wrote:
>>> + rcu = kvmalloc(count * sizeof(*rcu), GFP_KERNEL);
>>> + if (rcu) {
>>> + list_for_each_entry(q, &set->tag_list, tag_set_list) {
>>> + if (blk_queue_noquiesced(q))
>>> + continue;
>>> +
>>> + init_rcu_head(&rcu[i].head);
>>> + init_completion(&rcu[i].completion);
>>> + call_srcu(q->srcu, &rcu[i].head, wakeme_after_rcu);
>>> + i++;
>>> + }
>>> +
>>> + for (i = 0; i < count; i++) {
>>> + wait_for_completion(&rcu[i].completion);
>>> + destroy_rcu_head(&rcu[i].head);
>>> + }
>>> + kvfree(rcu);
>>> + } else {
>>> + list_for_each_entry(q, &set->tag_list, tag_set_list)
>>> + synchronize_srcu(q->srcu);
>>> + }
>>
>> Having to allocate a struct rcu_synchronize for each of the potentially
>> many queues here is a bit sad.
>>
>> Pull just explained the start_poll_synchronize_rcu interfaces at ALPSS
>> last week, so I wonder if something like that would also be feasible
>> for SRCU, as that would come in really handy here.
>
> There is start_poll_synchronize_srcu() and poll_state_synchronize_srcu(),
> but there would need to be an unsigned long for each srcu_struct from
> which an SRCU grace period was required. This would be half the size
> of the "rcu" array above, but still maybe larger than you would like.
>
> The resulting code might look something like this, with "rcu" now being
> a pointer to unsigned long:
>
> rcu = kvmalloc(count * sizeof(*rcu), GFP_KERNEL);
> if (rcu) {
> list_for_each_entry(q, &set->tag_list, tag_set_list) {
> if (blk_queue_noquiesced(q))
> continue;
> rcu[i] = start_poll_synchronize_srcu(q->srcu);
> i++;
> }
>
> for (i = 0; i < count; i++)
> if (!poll_state_synchronize_srcu(q->srcu))
> synchronize_srcu(q->srcu);
synchronize_srcu will restart a new period of grace.
Maybe it would be better like this:
while (!poll_state_synchronize_srcu(q->srcu, rcu[i]))
schedule_timeout_uninterruptible(1);
> kvfree(rcu);
> } else {
> list_for_each_entry(q, &set->tag_list, tag_set_list)
> synchronize_srcu(q->srcu);
> }
>
> Or as Christoph suggested, just have a single srcu_struct for the
> whole group.
>
> The main reason for having multiple srcu_struct structures is to
> prevent the readers from one from holding up the updaters from another.
> Except that by waiting for the multiple grace periods, you are losing
> that property anyway, correct? Or is this code waiting on only a small
> fraction of the srcu_struct structures associated with blk_queue?
>
> Thanx, Paul
> .
>
More information about the Linux-nvme
mailing list