[PATCH v3 0/3] Handle number of queue changes

James Smart jsmart2021 at gmail.com
Tue Aug 30 07:38:03 PDT 2022


On 8/30/2022 12:57 AM, Sagi Grimberg wrote:
> 
>>> One does wonder: what about FC?
>>> Does it suffer from the same problems?
>>>
>>> Cheers,
>>>
>>> Hannes
>>
>>
>> Yep, wondering too. I don't think so... FC does do this differently.
>>
>> We don't realloc io queues nor tag_sets.  We reuse the allocations and 
>> tag_set originally created in the 1st successful association for the 
>> controller.
>>
>> On reconnect, we set the queue count, then call 
>> blk_mq_update_nr_hw_queues() if it changed before we get into the 
>> loops to change queue states.
> 
> That is the same as tcp/rdma. We don't realloc tagsets or queues, we
> just reinitialize the queues based on the queue_count.

yeah - true.

> 
>> So FC's call to update nr_hw_queues() is much earlier than rdma/tcp 
>> today.
> 
> The only difference is that you don't freeze the request queues when
> tearing down the controller, so you can allocate/start all the queues in
> one go.

This is related to the way FC has to actually do things on the wire to 
terminate I/Os before it can declare the outstanding I/O dead, and while 
that termination is in flight the queues need to be unfrozen so that 
other paths can hit the reject states.

When I look at the mods, I still think it comes down to when the 
transports do the update.  Rdma/TCP's mods are specifically about the 
fact that start_io_queues, which loops on the (now changed) queue_count, 
has to be called before blk_mq_update_nr_hw_queues() updates the tag_set. 
The patch limits the initial call to what's in the (existing, pre-change) 
tag_set, and adds a follow-up call to start the remaining queues once the 
tag_set has been revised.
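
Roughly, the post-mod flow looks like this (my sketch of the tcp side's 
configure_io_queues path; the names and the exact limit calculation are 
approximations of the patch, not the literal diff):

    nr_queues = min(ctrl->tagset->nr_hw_queues + 1, ctrl->queue_count);

    /* first pass: only start the io queues the existing tag_set covers */
    ret = nvme_tcp_start_io_queues(ctrl, 1, nr_queues);
    ...
    if (!new) {
        nvme_start_queues(ctrl);
        ...
        blk_mq_update_nr_hw_queues(ctrl->tagset, ctrl->queue_count - 1);
        nvme_unfreeze(ctrl);
    }

    /* second pass: tag_set now reflects the new count, start the rest */
    ret = nvme_tcp_start_io_queues(ctrl, nr_queues + 1, ctrl->queue_count);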

FC, since it doesn't call such a loop routine, updates the tag_set before 
it can get into trouble.
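
For comparison, the FC reconnect path (nvme_fc_recreate_io_queues) does 
the update up front, along these lines (paraphrased from memory, not an 
exact quote of the code):

    /* nr_io_queues was re-negotiated via nvme_set_queue_count() above */
    if (prior_ioq_cnt != nr_io_queues)
        blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);

    /* only then walk the queues - the tag_set already matches the count */
    ret = nvme_fc_create_hw_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
    if (ret)
        goto out_free_io_queues;
    ret = nvme_fc_connect_io_queues(ctrl, ctrl->ctrl.sqsize + 1);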

> 
> In pci/rdma/tcp, we start by freezing the request queues to address a 
> hang that happens with multiple queue-maps (default/read/poll).
> See 2875b0aecabe ("nvme-tcp: fix controller reset hang during traffic").
> 
> I don't think that fc supports multiple queue maps, but in case the

Nope, we don't. It isn't as meaningful given the way the transport works.

> number of queues changes, blk_mq_update_nr_hw_queues() will still attempt
> to freeze the request queues, which may lead to a hang if some requests
> may not be able to complete (because the queues are quiesced at this
> time). However, I see that fc starts the queues in the end of
> nvme_fc_delete_association (which is a bit strange because the same can
> be achieved by passing start_queues=true to
> __nvme_fc_abort_outstanding_ios).

Yep - code fragmented a little over time.


> But that is the main difference, tcp/rdma does not start the queues when
> tearing down a controller in a reset, only after we re-establish the 
> queues. I think this was needed to support non-mpath configurations,
> where IOs do not failover. Maybe that is a legacy thing now for fabrics 
> though...

Agree. Thus the ordering difference vs. where the update is done.

-- james



