[RFC v1 0/3] Unifying fabrics drivers
James Smart
jsmart2021 at gmail.com
Tue Mar 7 14:09:02 PST 2023
On 3/7/2023 4:28 AM, Daniel Wagner wrote:
> On Tue, Mar 07, 2023 at 11:26:12AM +0200, Sagi Grimberg wrote:
>> I think we should make all transports to unify setup/teardown sequences
>> (i.e. including pcie and fc). Otherwise we are not gaining much.
>
> Not sure about pcie, I haven't really looked at it. The state machine
> really handles all the fabrics-specific queue handling.
>
> Ideally, fc would use the same state machine. Though the current code base
> differs quite significantly, I can try to convert it too.
I'll certainly help for FC.
The two main hurdles are our differences around:
a) the new flag - in FC I replaced it with a call into setup plus a state
flag. Shouldn't be much of an issue. We just need to look at when/where
the tagsets are allocated (and embedding that into admin queue setup isn't
necessarily best).
b) init_ctrl - I don't do the initial connect inline in the init_ctrl
call; I push it out to the normal "reconnect" path so that everything,
the initial connect and all reconnects, uses the same routine/work
element. That also means the transport will retry the initial connect if
the first attempt fails. To keep the app happy, init_ctrl waits for the
first connect attempt to finish before returning. I don't know how
rdma/tcp feel about it, but I found it a much better solution, and the
cleanup path became cleaner.
>
>> It will help if we make the ops higher level like
>>
>> ops.setup_transport(ctrl)
>> ops.alloc_admin_queue(ctrl)
>> ops.start_admin_queue(ctrl)
>> ops.stop_admin_queue(ctrl)
>> ops.free_admin_queue(ctrl)
>> ops.alloc_io_queues(ctrl)
>> ops.start_io_queues(ctrl)
>> ops.stop_io_queues(ctrl)
>> ops.free_io_queues(ctrl)
>
> This is more or less what v2 is.
>
>> The init/deinit can be folded to alloc/free I think.
>
> Yes, that would be a good thing. In my first attempt I tried to keep things
> more or less identical to the existing code, just using the callbacks for
> the transport-specific bits.
>
>> This is indeed a much larger effort, but I don't know what
>> this unification of rdma/tcp buys us really...
>
> My motivation is that we have been fixing the same bugs in the rdma and tcp
> transports for a while now. The state machine code is almost identical (*),
> so why not try to reduce the code duplication?
>
> (*) There are a few small things which are not the same.
More information about the Linux-nvme
mailing list