[PATCHv6 1/1] nvme-tcp: Add option to set the physical interface to be used when connecting over TCP sockets.

Belanger, Martin Martin.Belanger at dell.com
Thu May 20 11:48:16 PDT 2021


> On 5/17/21 11:16 AM, Martin Belanger wrote:
> > From: Martin Belanger <martin.belanger at dell.com>
> >
> > Addressed Sagi's review from PATCHv5.
> 
> This commentary belongs after the '---' separator.
> 
> >
> > In our application, we need a way to force TCP connections to go out a
> > specific IP interface instead of letting Linux select the interface
> > based on the routing tables. This patch adds the option 'host-iface'
> > to allow specifying the interface to use. Note that corresponding
> > changes to the nvme-cli utility will follow.
> >
> > When the option host-iface is specified, the driver uses the specified
> > interface to set the option SO_BINDTODEVICE on the TCP socket before
> > connecting.
> >
> > This new option is needed in addition to the existing host-traddr for
> > the following reasons:
> >
> > Specifying an IP interface by its associated IP address is less
> > intuitive than specifying the actual interface name and, in some
> > cases, simply doesn't work. That's because the association between
> > interfaces and IP addresses is not predictable. IP addresses can be
> > changed or can change by themselves over time (e.g. DHCP). Interface
> > names are predictable [1] and will persist over time. Consider the
> > following configuration.
> >
> > 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state ...
> >      link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> >      inet 100.0.0.100/24 scope global lo
> >         valid_lft forever preferred_lft forever
> > 2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc ...
> >      link/ether 08:00:27:21:65:ec brd ff:ff:ff:ff:ff:ff
> >      inet 100.0.0.100/24 scope global enp0s3
> >         valid_lft forever preferred_lft forever
> > 3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc ...
> >      link/ether 08:00:27:4f:95:5c brd ff:ff:ff:ff:ff:ff
> >      inet 100.0.0.100/24 scope global enp0s8
> >         valid_lft forever preferred_lft forever
> >
> > The above is a VM that I configured with the same IP address
> > (100.0.0.100) on all interfaces. Doing a reverse lookup to identify
> > the unique interface associated with 100.0.0.100 does not work here.
> > And this is why the option host_iface is required. I understand that
> > the above config does not represent a standard host system, but I'm
> > using this to prove a point: "We can never know how users will
> > configure their systems". By the way, the above configuration is
> > perfectly fine by Linux.
> >
> > The current TCP implementation for host_traddr performs a
> > bind()-before-connect(). This is a common construct to set the source
> > IP address on a TCP socket before connecting. This has no effect on
> > how Linux selects the interface for the connection. That's because
> > Linux uses the Weak End System model as described in RFC1122 [2]. On
> > the other hand, setting the Source IP Address has benefits and should
> > be supported by linux-nvme. In fact, setting the Source IP Address is
> > a mandatory FedGov requirement (e.g. connection to a RADIUS/TACACS+
> > server).
> > Consider the following configuration.
> >
> > $ ip addr list dev enp0s8
> > 3: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc ...
> >      link/ether 08:00:27:4f:95:5c brd ff:ff:ff:ff:ff:ff
> >      inet 192.168.56.101/24 brd 192.168.56.255 scope global enp0s8
> >         valid_lft 426sec preferred_lft 426sec
> >      inet 192.168.56.102/24 scope global secondary enp0s8
> >         valid_lft forever preferred_lft forever
> >      inet 192.168.56.103/24 scope global secondary enp0s8
> >         valid_lft forever preferred_lft forever
> >      inet 192.168.56.104/24 scope global secondary enp0s8
> >         valid_lft forever preferred_lft forever
> >
> > Here we can see that several addresses are associated with interface
> > enp0s8. By default, Linux always selects the default IP address,
> > 192.168.56.101, as the source address when connecting over interface
> > enp0s8. Some users, however, want the ability to specify a different
> > source address (e.g., 192.168.56.102, 192.168.56.103, ...). The option
> > host_traddr can be used as-is to perform this function.
> >
> > In conclusion, I believe that we need 2 options for TCP connections.
> > One that can be used to specify an interface (host-iface). And one
> > that can be used to set the source address (host-traddr). Users should
> > be allowed to use one or the other, or both, or none. Of course, the
> > documentation for host_traddr will need some clarification. It should
> > state that when used for TCP connection, this option only sets the
> > source address. And the documentation for host_iface should say that
> > this option is only available for TCP connections.
> >
> > References:
> > [1] https://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames/
> > [2] https://tools.ietf.org/html/rfc1122
> >
> > Tested both IPv4 and IPv6 connections.
> 
> Also this.
> 
> Can you send the nvme-cli bits as well?

Hi Sagi,

Just checking if there is anything else I can do to help with this patch?

The corresponding nvme-cli changes can be viewed on GitHub at the following link:
https://github.com/martin-belanger/nvme-cli/commit/628aca9d66ddffaa78bdaa46668ecdc3d000a017

Note that I will only submit the nvme-cli changes after this nvme-tcp patch has been approved.

Thanks,
Martin


