[PATCH 1/1] Add 'Transport Interface' (triface) option. This can be used to specify the IP interface to use for the connection. The driver uses that to set SO_BINDTODEVICE on the socket before connecting.

Sagi Grimberg sagi at grimberg.me
Tue May 11 18:13:01 BST 2021


> I've been thinking about what you said regarding the need to repeat the -w option when two interfaces share the same IP address. I think we're looking at the problem from a different point of view. The current implementation uses an IP address to identify an interface. I, on the other hand, believe that the best way to identify an interface is by its "interface name or index". In previous emails, I provided examples of the problems that may occur when using an IP address to identify an interface. For example, one can assign the same IP address to different interfaces making it impossible to distinguish interfaces by their IP address alone. Another example is that the low level APIs (e.g. setsockopt(SO_BINDTODEVICE)) don’t even require the source IP address. They only need the interface name/index. So, why go through the trouble of performing a reverse address lookup to retrieve the interface name/index when the address is not used at all?
> 
> By the way, if nvme-cli/linux-nvme allowed specifying interfaces by name/index, then we would not really need to repeat the -w option unless we also wanted to set the source address at the same time. Setting the source address is a completely different thing from setting the interface. One should be allowed to set one independently from the other, or both, or none.
> 
> If you look at how ping is implemented, they do not infer the interface from the IP address. If one wants to force ping to go over an interface, then one must provide the interface by name/index using the -I option. If one wants to change the source IP address (without forcing a specific interface), then one provides the IP address to the -I option. It's simple and intuitive. And ping also supports appending the interface to the Destination IP using the '%' delimiter for IPv6-only as per RFC4007.
> 
> I think that nvme-cli/linux-nvme should follow the ping approach. Interfaces should never be inferred from source IP addresses, but instead be clearly identified by their name or index. And setting the source address should be independent from setting the interface.
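
For reference, the independence of the two settings described above can be sketched at the socket level. This is only an illustration in Python, not driver or nvme-cli code: the helper name open_socket and its parameters are made up for the example, and it assumes a Linux host. Binding to a device may also need extra privileges depending on the kernel version.

```python
import socket

# SO_BINDTODEVICE is Linux-specific; fall back to its Linux value (25)
# in case the socket module does not export the constant.
SO_BINDTODEVICE = getattr(socket, "SO_BINDTODEVICE", 25)


def open_socket(src_addr=None, iface=None):
    """Create a TCP socket, applying source address and interface independently.

    Hypothetical helper for illustration only. Either argument may be
    omitted; neither one is derived from the other.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if iface is not None:
        # Only the interface name is needed here, no IP address.
        sock.setsockopt(socket.SOL_SOCKET, SO_BINDTODEVICE,
                        iface.encode() + b"\0")
    if src_addr is not None:
        # Setting the source address is a separate bind() call.
        # Port 0 lets the kernel pick an ephemeral port.
        sock.bind((src_addr, 0))
    return sock
```

Calling open_socket(iface="eth0") would pin the socket to an interface without choosing a source address, and open_socket(src_addr="192.168.1.2") would do the opposite ("eth0" and the address are placeholders).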

I'm starting to think that we are going in circles. I'm coming around
to the view that having host_iface is the right way to go.

We can have nvme-cli convert <addr>%iface notation to
"..,host_traddr=<addr>,host_iface=<iface>,.." when creating the
controller string...
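
A rough sketch of that conversion, written in Python only to show the idea (nvme-cli itself is C, and nothing below is existing nvme-cli code). The function names split_host_addr and controller_options are hypothetical, and the exact option string layout is an assumption based on the usual comma-separated fabrics options.

```python
def split_host_addr(spec):
    """Split '<addr>%<iface>' into (addr, iface).

    Hypothetical helper: iface is None when no '%' is present, and addr
    may be empty when only an interface is given (e.g. '%eth0').
    """
    addr, sep, iface = spec.partition("%")
    if sep and not iface:
        raise ValueError("missing interface name after '%'")
    return addr, (iface if sep else None)


def controller_options(base_opts, host_spec):
    """Append host_traddr/host_iface to a list of 'key=value' options."""
    addr, iface = split_host_addr(host_spec)
    opts = list(base_opts)
    if addr:
        opts.append("host_traddr=" + addr)
    if iface:
        opts.append("host_iface=" + iface)
    return ",".join(opts)


base = ["transport=tcp", "traddr=192.168.1.10", "trsvcid=4420"]
print(controller_options(base, "192.168.1.2%eth0"))
```

With this split, "192.168.1.2" alone would set only host_traddr and "%eth0" would set only host_iface, so the two stay independent. The addresses and interface name are placeholders.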


