[PATCH v2 net-next 19/21] net/mlx5e: NVMEoTCP, data-path for DDP offload
David Ahern
dsahern at gmail.com
Mon Jan 18 23:36:30 EST 2021
On 1/17/21 1:42 AM, Boris Pismenny wrote:
> This is needed for a few reasons that are explained in detail
> in the tcp-ddp offload documentation. See patch 21 overview
> and rx-data-path sections. Our reasons are as follows:
I read the documentation patch, and it does not explain this; nor
should it, since the handling is very mlx specific based on these
changes. Different h/w will have different limitations. Given that, it
would be best to enhance the patch description to explain why these
gymnastics are needed for the skb.
> 1) Each SKB may contain multiple PDUs. DDP offload doesn't operate on
> PDU headers, so these are written in the receive ring. Therefore, we
> need to rebuild the SKB to account for it. Additionally, due to HW
> limitations, we will only offload the first PDU in the SKB.
Are you referring to LRO skbs here? I can't imagine going through this
for 1500-byte packets that have multiple PDUs.
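
To make sure I'm reading the rebuild correctly: for the one offloaded
PDU, the PDU header bytes stay in the receive ring page and the payload
frag ends up pointing at the pre-posted destination buffer. The sketch
below is what I have in mind; every helper and parameter name here is
mine for illustration, not the actual mlx5 code:

#include <linux/skbuff.h>
#include <linux/netdevice.h>
#include <linux/mm.h>

/*
 * Sketch of my reading of the rebuild for a single offloaded PDU.
 * All names are invented for illustration.
 */
static struct sk_buff *ddp_rebuild_skb_sketch(struct napi_struct *napi,
					      struct page *ring_page,
					      u32 hdr_off, u32 hdr_len,
					      struct page *dst_page,
					      u32 dst_off, u32 payload_len)
{
	struct sk_buff *skb;

	/* The PDU header was written to the receive ring as usual. */
	skb = napi_alloc_skb(napi, hdr_len);
	if (!skb)
		return NULL;
	skb_put_data(skb, page_address(ring_page) + hdr_off, hdr_len);

	/*
	 * The payload was DDP'ed straight into the destination buffer,
	 * so the frag references that page rather than the ring page.
	 */
	get_page(dst_page);
	skb_add_rx_frag(skb, 0, dst_page, dst_off, payload_len, PAGE_SIZE);

	return skb;
}

If that is roughly it, then only LRO/GRO sized skbs seem worth the
trouble, hence the question above.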
> 2) The newly constructed SKB represents the original data as it is on
> the wire, such that the network stack is oblivious to the offload.
> 3) We decided not to modify all of the mlx5e_skb_from_cqe* functions
> because it would make the offload harder to distinguish, and it would
> add overhead to the existing data-path functions. Therefore, we opted
> for this modular approach.
>
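For 3), the modular approach you describe sounds to me like the shape
below, where every identifier is invented for illustration (the real
code would key off the CQE and whatever mlx5e_skb_from_cqe* returned):

#include <linux/skbuff.h>

/* Illustrative stubs only; none of these names exist in the driver. */
struct rx_completion_sketch {
	bool payload_was_ddped;		/* derived from the CQE */
};

static struct sk_buff *ddp_fixup_skb_sketch(struct sk_buff *skb)
{
	/* stands in for the PDU header/payload rebuild sketched above */
	return skb;
}

static struct sk_buff *rx_build_skb_sketch(struct rx_completion_sketch *cqe,
					   struct sk_buff *skb)
{
	/*
	 * The common build path stays untouched; only the offloaded
	 * case takes the extra hop.
	 */
	if (skb && cqe->payload_was_ddped)
		skb = ddp_fixup_skb_sketch(skb);
	return skb;
}

i.e., the existing mlx5e_skb_from_cqe* helpers are left alone and the
offload handling is layered on top as a separate step.
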
> If we only had generic header-data split, then we just couldn't
> provide this offload. It is not enough to place payload into some
> buffer without TCP headers because RPC protocols and advanced storage
> protocols, such as nvme-tcp, reorder their responses and require data
> to be placed into application/pagecache buffers, which are anything
> but anonymous. In other words, header-data split alone writes data
> to the wrong buffers (reordering), or to anonymous buffers that
> can't be page-flipped to replace application/pagecache buffers.
>