[PATCH v3 0/3] Handle number of queue changes
James Smart
jsmart2021 at gmail.com
Tue Aug 30 07:38:03 PDT 2022
On 8/30/2022 12:57 AM, Sagi Grimberg wrote:
>
>>> One does wonder: what about FC?
>>> Does it suffer from the same problems?
>>>
>>> Cheers,
>>>
>>> Hannes
>>
>>
>> Yep, wondering too. I don't think so... FC does do this differently.
>>
>> We don't realloc io queues nor tag_sets. We reuse the allocations and
>> tag_set originally created in the 1st successful association for the
>> controller.
>>
>> On reconnect, we set the queue count, then call
>> blk_mq_update_nr_hw_queues() if it changed before we get into the
>> loops to change queue states.
>
> That is the same as tcp/rdma. We don't realloc tagsets or queues, we
> just reinitialize the queues based on the queue_count.
yeah - true.
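(To put the FC flow above in pseudo-code - a simplified sketch, where
negotiate_queue_count() and start_io_queues() are hypothetical stand-ins
for the per-transport helpers, and only blk_mq_update_nr_hw_queues() is
the real block layer call:)

    static int reconnect_io_queues(struct my_ctrl *ctrl)
    {
            /* re-read how many I/O queues the controller grants us */
            ctrl->queue_count = negotiate_queue_count(ctrl);

            /* FC: resize the tag_set *before* touching per-queue state */
            if (ctrl->queue_count != ctrl->tag_set->nr_hw_queues)
                    blk_mq_update_nr_hw_queues(ctrl->tag_set,
                                               ctrl->queue_count);

            /* only now loop over the queues and (re)start them */
            return start_io_queues(ctrl, ctrl->queue_count);
    }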
>
>> So FC's call to update nr_hw_queues() is much earlier than rdma/tcp
>> today.
>
> The only difference is that you don't freeze the request queues when
> tearing down the controller, so you can allocate/start all the queues in
> one go.
This is related to the way FC has to actually do things on the wire to
terminate I/Os before we can declare outstanding I/O dead, and while
that's in flight on the wire, the queues needed to be unfrozen so that
other paths could hit reject states.
When I look at the mods, I still think it comes down to when the
transports do the update. Rdma/TCP's mods are specifically that
start_io_queues, which loops on the (now changed) queue_count, has to be
called before blk_mq_update_nr_hw_queues() is called to update the count
on the tag_set. The patch limits the initial call to what's in the
existing (pre-change) tag_set, and adds a follow-up call to start all
queues in the now-revised tag_set.
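I.e. the tcp/rdma side ends up with roughly this two-phase shape (a
simplified sketch of the ordering being discussed, not the literal
patch; start_io_queues(first, last) stands in for the transport's start
loop):

    /* phase 1: only start the queues the existing (pre-change)
     * tag_set already knows about
     */
    nr_queues = min(ctrl->tag_set->nr_hw_queues + 1, ctrl->queue_count);
    ret = start_io_queues(ctrl, 1, nr_queues);
    if (ret)
            return ret;

    /* safe now: no queue beyond the old tag_set has been touched */
    blk_mq_update_nr_hw_queues(ctrl->tag_set, ctrl->queue_count - 1);

    /* phase 2: start whatever the update just added (the -1/+1 above
     * account for the admin queue occupying index 0 of queue_count)
     */
    ret = start_io_queues(ctrl, nr_queues, ctrl->queue_count);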
FC, as it doesn't call such a loop routine, updates the tag_set before
it can get into trouble.
>
> In pci/rdma/tcp, we start by freezing the request queues to address a
> hang that happens with multiple queue-maps (default/read/poll).
> See 2875b0aecabe ("nvme-tcp: fix controller reset hang during traffic").
>
> I don't think that fc supports multiple queue maps, but in case the
Nope, we don't. It isn't as meaningful given the way the transport works.
> number of queue changes, blk_mq_update_nr_hw_queues() will still attempt
> to freeze the request queues, which may lead to a hang if some requests
> may not be able to complete (because the queues are quiesced at this
> time). However, I see that fc starts the queues in the end of
> nvme_fc_delete_association (which is a bit strange because the same can
> be achieved by passing start_queues=true to
> __nvme_fc_abort_outstanding_ios).
Yep - code fragmented a little over time.
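For reference, the hang scenario is the usual quiesce-vs-freeze
interaction (the two calls below are the real block layer primitives;
the scenario itself is the point):

    /* a quiesced hctx cannot dispatch, so requests already queued
     * never complete...
     */
    blk_mq_quiesce_queue(q);

    /* ...so a freeze, which waits for all in-flight references to
     * drain, can wait forever. blk_mq_update_nr_hw_queues() freezes
     * every request queue on the tag_set internally, hence the
     * possible hang if it runs while queues are quiesced with
     * requests outstanding.
     */
    blk_mq_freeze_queue(q);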
> But that is the main difference, tcp/rdma does not start the queues when
> tearing down a controller in a reset, only after we re-establish the
> queues. I think this was needed to support a non-mpath configurations,
> where IOs do not failover. Maybe that is a legacy thing now for fabrics
> though...
Agree - thus the ordering difference vs. when the update is done.
-- james