[RFC v1 0/3] Unifying fabrics drivers

James Smart jsmart2021 at gmail.com
Tue Mar 7 14:09:02 PST 2023


On 3/7/2023 4:28 AM, Daniel Wagner wrote:
> On Tue, Mar 07, 2023 at 11:26:12AM +0200, Sagi Grimberg wrote:
>> I think we should make all transports to unify setup/teardown sequences
>> (i.e. including pcie and fc). Otherwise we are not gaining much.
> 
> Not sure about pcie, but I haven't really looked at it. The state machine
> really handles all the fabrics-specific queue handling.
> 
> Ideally, fc would use the same state machine, though the current code base
> differs quite significantly. I can try to convert it too.

I'll certainly help for FC.

The 2 main hurdles are our differences around:
a) the new flag - which, in FC, I replaced with a setup call and a state 
flag. This shouldn't be much of an issue. We just need to look at when/where 
the tagsets are allocated (and embedding that into admin queue setup isn't 
necessarily best).

b) the init_ctrl - I don't do the initial connect inline in the 
init_ctrl call - I push it out to the normal "reconnect" path so that 
everything, the initial connect and all reconnects, uses the same 
routine/work element. This also means the transport will retry the 
initial connect if the first attempt fails. To keep the application 
happy, init_ctrl waits for the 1st connect attempt to finish before 
returning. I don't know how rdma/tcp feel about it, but I found it a 
much better solution, and cleanup became cleaner.


> 
>> It would help if we made the ops higher-level, like:
>>
>> ops.setup_transport(ctrl)
>> ops.alloc_admin_queue(ctrl)
>> ops.start_admin_queue(ctrl)
>> ops.stop_admin_queue(ctrl)
>> ops.free_admin_queue(ctrl)
>> ops.alloc_io_queues(ctrl)
>> ops.start_io_queues(ctrl)
>> ops.stop_io_queues(ctrl)
>> ops.free_io_queues(ctrl)
> 
> This is more or less what v2 is.
> 
>> The init/deinit can be folded to alloc/free I think.
> 
> Yes, that would be a good thing. In my first attempt I tried to keep things
> more or less identical to the existing code, just using the callbacks for the
> transport-specific bits.
> 
>> This is indeed a much larger effort, but I don't know what
>> this unification of rdma/tcp buys us really...
> 
> My motivation is that we have been fixing the same bugs in the rdma and tcp
> transports for a while. The state machine code is almost identical(*), so why
> not try to reduce the code duplication?
> 
> (*) There are a few small things which are not the same.

More information about the Linux-nvme mailing list