[PATCH v7 01/23] net: Introduce direct data placement tcp offload
Jakub Kicinski
kuba at kernel.org
Thu Nov 3 18:57:13 PDT 2022
On Thu, 03 Nov 2022 19:29:33 +0200 Aurelien Aptel wrote:
> Jakub,
>
> We came up with 2 designs for controlling the ULP DDP capability bits
> and getting the ULP DDP statistics.
>
> Both designs share some concepts so I'm going to talk about the common
> stuff first:
>
> We drop the netdev->feature bit. To fully disable ULP DDP offload the
> caps will have to be set to 0x0.
>
> In both design we replace the feature bit with a new field
> netdev->ulp_ddp_caps
>
> struct ulp_ddp_cap {
> bitmap caps_hw; // what the hw supports (filled by the driver, used as reference once initialized)
> bitmap caps_active; // what is currently set for the system, can be modified from userspace
> };
>
> We add a new OP net_device_ops->ndo_set_ulp_caps() that drivers have
> to provide to fill netdev->ulp_ddp_caps.caps_hw. We call it around
> the same time as when we call ndo_set_features().
Sounds good. Just to be clear - I was suggesting:
net_device_ops->ddp_ulp_ops->set_ulp_caps()
so an extra indirection, but if you're worried about the overhead
an ndo is fine, too.
> Interfacing with userspace is where the design differs.
>
> Design A ("netlink"):
> =====================
>
> # Capabilities
>
> We can expose to the users a new ethtool api using netlink.
>
> For this we want to have a dynamic system where userspace doesn't have
> to hardcode all the caps but instead can get a list. We implement
> something similar to what is done for features bits.
>
> We add a table to map caps to string names
>
> const char *ulp_ddp_cap_names[] = {
> [ULP_DDP_NVME_TCP_XXX] = "nvme-tcp-xxx",
> ...
> };
Right, you should be able to define your own strset (grep for
stats_std_names for an example).
> We add ETHTOOL messages to get and set ULP caps:
>
> - ETHTOOL_MSG_ULP_CAPS_GET: get device ulp capabilities
> - ETHTOOL_MSG_ULP_CAPS_SET: set device up capabilities
ULP or DDP? Are you planning to plumb TLS thru the same ops?
Otherwise ULP on its own may be a little too generic of a name.
> The GET reply code can use ethnl_put_bitset32() which does the job of
> sending bits + their names as strings.
>
> The SET code would apply the changes to netdev->ulp_ddp_caps.caps_active.
>
> # Statistics
>
> If the ETHTOOL_MSG_ULP_CAPS_GET message requests statistics (by
Would it make sense to drop the _CAPS from the name, then?
Or replace by something more general, like INFO?
We can call the bitset inside the message CAPS but the message
also carries stats and perhaps other things in the future.
> setting the header flag ETHTOOL_FLAG_STATS) the kernel will append all
> the ULP statistics of the device at the end of the reply.
>
> Those statistics will be dynamic in the sense that they will use a new
> stringset for their names that userspace will have to fetch.
>
> # Ethtool changes
> We will add the -u|-U|--ulp-get|--ulp-set options to ethtool.
>
> # query list of caps supported and their value on device $dev
> ethtool -u|--ulp-get <dev>
>
> # query ULP stats of $dev
> ethtool -u|--ulp-get --include-statistics <dev>
-I|--include-statistics ?
> # set $cap to $val on device $dev
> -U|--ulp-set <dev> <cap> [on|off]
Sounds good!
> Design B ("procfs")
> ===================
>
> In this design we add a new /proc/sys/net/ulp/* hierarchy, under which
> we will add a directory per device (e.g. /proc/sys/net/ulp/eth0/) to
> configure/query ULP DDP.
>
> # Capabilities
>
> # set capabilities per device
> $ echo 0x1 > /proc/sys/net/ulp/<device>/caps
>
> # Statistics
>
> # show per device stats (global and per queue)
> # space separated values, 1 stat per line
> $ cat /proc/sys/net/ulp/<device>/stats
> rx_nvmeotcp_drop 0
> rx_nvmeotcp_resync 403
> rx_nvmeotcp_offload_packets 75614185
> rx_nvmeotcp_offload_bytes 107016641528
> rx_nvmeotcp_sk_add 1
> rx_nvmeotcp_sk_add_fail 0
> rx_nvmeotcp_sk_del 0
> rx_nvmeotcp_ddp_setup 3327969
> rx_nvmeotcp_ddp_setup_fail 0
> rx_nvmeotcp_ddp_teardown 3327969
>
> I can also suggest the existing paths:
>
> - /sys/class/net/<device>/statistics/
> - /proc/net/stat/
>
> Or any other path you will prefer.
Thanks for describing the options! I definitely prefer ethtool/netlink.
More information about the Linux-nvme
mailing list