[RFC v1 0/3] Unifying fabrics drivers

James Smart jsmart2021 at gmail.com
Wed Mar 8 07:13:39 PST 2023


On 3/8/2023 3:38 AM, Sagi Grimberg wrote:
> 
> 
> On 3/8/23 00:09, James Smart wrote:
>> On 3/7/2023 4:28 AM, Daniel Wagner wrote:
>>> On Tue, Mar 07, 2023 at 11:26:12AM +0200, Sagi Grimberg wrote:
>>>> I think we should make all transports unify their setup/teardown sequences
>>>> (i.e. including pcie and fc). Otherwise we are not gaining much.
>>>
>>> Not sure about pcie but I haven't really looked at it. The state machine
>>> is really handling all the fabrics-specific queue handling.
>>>
>>> Ideally, fc would use the same state machine. Though the current code base
>>> differs quite significantly, I can try to convert it too.
>>
>> I'll certainly help for FC.
>>
>> The 2 main hurdles are our differences around:
>> a) the new flag - which in fc I replaced with the calling setup and a
>> state flag. Shouldn't be much of an issue. We just need to look at
>> when/where the tagsets are allocated (and embedding that into admin queue
>> setup isn't necessarily best).
> 
> Does fc create an I/O tagset on every reconnect? Doesn't appear
> that it is removed on every disconnect... I'm a bit confused.

No. It maintains a flag (ioq_live) that is set after the first time it 
creates the io queues. This flag replaces the "new" argument. In 
controller connect, after admin/init/aen_init, it uses the flag to 
select either a routine that initially sizes and creates the io 
tag_set, or a routine that resizes and blk_mq_update's the io tag set.

fc should have the same functionality as rdma/tcp; the calling sequence 
is just a little different.
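
In rough terms, the flag-driven selection looks something like the sketch 
below (the xxx_* names, struct layout and helpers are purely illustrative, 
not the actual nvme-fc symbols):

/*
 * Illustrative sketch only - not the real nvme-fc code. The xxx_*
 * names and helpers are hypothetical stand-ins for the fc routines
 * described above.
 */
struct xxx_ctrl {
	bool		ioq_live;	/* set once the io queues/tagset exist */
	unsigned int	nr_io_queues;
};

static int xxx_create_io_queues(struct xxx_ctrl *ctrl);	/* size + alloc io tagset */
static int xxx_recreate_io_queues(struct xxx_ctrl *ctrl);	/* resize + update tagset */

static int xxx_setup_io_queues(struct xxx_ctrl *ctrl)
{
	int ret;

	if (!ctrl->ioq_live) {
		/* first association: size the queues and allocate the io tagset */
		ret = xxx_create_io_queues(ctrl);
		if (!ret)
			ctrl->ioq_live = true;
		return ret;
	}

	/* reconnect: keep the existing tagset, just resize/update it */
	return xxx_recreate_io_queues(ctrl);
}

The net effect is the same as the rdma/tcp "new" argument, just carried as 
controller state instead of being passed down the call chain.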

> 
>> b) the init_ctrl - I don't have the initial connect inline in the
>> init_ctrl call - I push it out to the normal "reconnect" path so that
>> everything, initial connect and all reconnects, uses the same
>> routine/work element. It also means the transport will retry the initial
>> connect if there's a failure on the initial attempt. To keep the app
>> happy, init_ctrl will wait for the 1st connect attempt to finish
>> before returning. Don't know how rdma/tcp feels about it, but I
>> found it a much better solution, and the cleanup became cleaner.
> 
> Don't see a major difference; I personally find the initial setup
> different from a reconnect, so it's simpler to do them separately.
> 
> Can these two differences change for unification of the transports?

Will I change them to match rdma/tcp?

The "new" flag isn't a big deal, but I prefer not to pass a flag around. 
It was a straightforward change.

The initial connect - no, I won't change fc. It originally was the same 
as rdma/tcp, but the separate initial-create path was causing headaches 
with immediate/parallel errors and reconnect. The code ended up being 
duplicated and competing, and it was affecting the way ref-counting was 
being done. It became much cleaner when it was reorganized to reuse the 
existing reconnect path. It took a lot of time and testing to reach where 
it is.

I don't know how much time/testing has gone into rdma/tcp for injected 
errors or immediate connectivity loss. FC also can't just stop if the 
initial connect fails (e.g. one of the first fabric commands fails due to 
an injected error). As everything is automated via sysadm and driven by 
events from the host adapters, if we stopped after one connect attempt the 
storage wouldn't show up. Having the reconnect path exist even for the 
initial connect was beneficial.
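
For reference, the shape of that reorganization is roughly the sketch below 
(again purely illustrative xxx_* names; the real code has a lot more state, 
ref-counting and error handling around it):

/*
 * Illustrative sketch only - hypothetical names. The point is that the
 * initial connect just kicks the same connect/reconnect work item and
 * then waits for that first attempt before init_ctrl returns.
 */
struct xxx_ctrl {
	struct delayed_work	connect_work;
	bool			live;		/* set when an association succeeds */
};

static struct workqueue_struct *xxx_wq;		/* hypothetical driver workqueue */

static void xxx_connect_ctrl_work(struct work_struct *work)
{
	/*
	 * Single path used for the initial connect and every reconnect:
	 * set up the admin queue, init the controller, create or update
	 * the io queues, and reschedule itself on a recoverable failure.
	 */
}

static int xxx_init_ctrl(struct xxx_ctrl *ctrl)
{
	INIT_DELAYED_WORK(&ctrl->connect_work, xxx_connect_ctrl_work);

	/* route the very first connect through the normal reconnect path */
	queue_delayed_work(xxx_wq, &ctrl->connect_work, 0);

	/* keep the app happy: wait for that first attempt to finish */
	flush_delayed_work(&ctrl->connect_work);

	return ctrl->live ? 0 : -EIO;
}

The win is that error/retry handling for the first attempt is literally the 
same code as for any later reconnect.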

However, there's a lot of commonality we can address outside of this.

> 
> I would also like to get rid of the new flag, and just break these
> sequences into different routines.

-- james



