[PATCH v3] nvme-tcp: Print actual source IP address through sysfs "address" attr
Belanger, Martin
Martin.Belanger at dell.com
Tue Sep 6 08:32:47 PDT 2022
> >>> TCP transport relies on the routing table to determine which source
> >>> address and interface to use when making a connection. Currently,
> >>> there is no way to tell from userspace where a connection was made.
> >>> This patch exposes the actual source address using a new field named
> >>> "src_addr=" in the "address" attribute.
> >>>
> >>> This is needed to diagnose and identify connectivity issues. With
> >>> the source address we can infer the interface associated with each
> >>> connection.
> >>>
> >>> This was tested with nvme-cli 2.0 to verify it does not have any
> >>> adverse effect. The new "src_addr=" field will simply be displayed
> >>> in the output of the "list-subsys" or "list -v" commands as shown here.
> >>>
> >>> $ nvme list-subsys
> >>> nvme-subsys0 - NQN=nqn.2014-08.org.nvmexpress.discovery
> >>> \
> >>> +- nvme0 tcp
> >>> traddr=192.168.56.1,trsvcid=8009,src_addr=192.168.56.101 live
> >>
> >> I wonder how this works with udev events etc for auto-connect? and
> >> also for uniqueness checks...
> >
> > Udev events are triggered by device properties (which are not the same as
> attributes).
> > The properties are found in the attribute "uevent". For example:
> >
> > $ cat /sys/class/nvme/nvme1/uevent
> > MAJOR=238
> > MINOR=1
> > DEVNAME=nvme1
> > NVME_TRTYPE=tcp
> > NVME_TRADDR=::ffff:192.168.56.1
> > NVME_TRSVCID=8009
> > NVME_HOST_TRADDR=none
> > NVME_HOST_IFACE=enp0s8
> >
> > The change I'm proposing does not affect device properties and therefore
> no impact on udev rules.
> >
> > With regards to uniqueness checks, there is only one place in libnvme where
> we read and check the "address" attribute (file: tree.c, function:
> nvme_ctrl_alloc). The code will simply skip "src_address=" and extract all the
> other known keys from "address".
>
> I'm concerned that adding this unconditionally may trigger someone to fail
> due to this addition... It is also confusing when host_traddr is provided...
That would happen if people are testing the "address" attribute as a "string".
In that case, I would say it's a bug on their part because there is no guarantee
that all the key=value pairs in the "address" string will always be in the same
order. By the way, I added a new key (host_iface) to the "address" attribute
over a year ago (May 2021) and have not seen any negative effect. Yes, it can
be confusing when host_traddr is provided, but in that case src_addr and
host_traddr will be equal. In fact, that's the reason why Hannes suggested
only using the host_addr and not introducing a new key (src_addr).
>
> If this would be logged as pr_dbg instead, would this achieve the goal?
Not really. Yes a log would help in the case someone is debugging a connectivity
issue in real time while looking at the logs, but what if the system has been up
for days? I know that the syslog can keep several days (weeks) worth of info
but that's not the same as having an attribute you can look at directly.
Also, an attribute is needed for the nvme-stas application I'm working on.
An attribute allows nvme-stas to run a complete audit of all existing connections.
Not just the configuration parameters, but where each connections is actually
made.
Just this morning I was debugging an issue where an I/O controller connection
was made on the management interface (default route) instead of going over
the interface dedicated for subsystem connections. This happened because the
user forgot to add explicit routes for the subsystems. By having an attribute
to expose where a connection is made we can then provide tools to help user
diagnose connectivity issues by themselves instead of submitting tickets
requiring the developer's help. That's the main goal for this change.
More information about the Linux-nvme
mailing list