[PATCH libnvme] fabrics: don't pass sqflow_disable if the kernel does not support the option

Caleb Sander csander at purestorage.com
Mon Jul 24 09:01:06 PDT 2023


On Mon, Jul 24, 2023 at 8:35 AM Sagi Grimberg <sagi at grimberg.me> wrote:
>
>
> > Hi Sagi,
> >
> > On Mon, Jul 24, 2023 at 2:11 AM Sagi Grimberg <sagi at grimberg.me> wrote:
> >>
> >> Instead of retrying again without it.
> >>
> >> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
> >> ---
> >>   src/nvme/fabrics.c | 12 +-----------
> >>   1 file changed, 1 insertion(+), 11 deletions(-)
> >>
> >> diff --git a/src/nvme/fabrics.c b/src/nvme/fabrics.c
> >> index fd761d10d0cb..2a9d8dc7f23a 100644
> >> --- a/src/nvme/fabrics.c
> >> +++ b/src/nvme/fabrics.c
> >> @@ -1026,7 +1026,7 @@ nvme_ctrl_t nvmf_connect_disc_entry(nvme_host_t h,
> >>                  return NULL;
> >>          }
> >>
> >> -       if (e->treq & NVMF_TREQ_DISABLE_SQFLOW)
> >> +       if (e->treq & NVMF_TREQ_DISABLE_SQFLOW && h->r->options->disable_sqflow)
> >
> > How do we know options is initialized here? It looks like that's done
> > in __nvmf_supported_options(), called from nvmf_add_ctrl(), which is
> > called a few lines later in this function.
>
> This is for connecting discovery log entries, so the kernel cap was
> already read once for the discovery controller.
>
> Although granted, this can be udev driven on an existing discovery
> controller. Probably need to do it regardless of adding a controller.

Yes, I would avoid depending on the assumption that a connection has
already been made, especially since libnvme is designed to be a
standalone library, not necessarily used by nvme-cli.

>
> I remember when adding the capability to check supported opts I added
> an interface to do it per feature. Not sure why it was changed, it is
> much more sane to just read it on demand given that there is just a
> simple and short string copy in the kernel...
>
> Ideally this would become:
>         if (e->treq & NVMF_TREQ_DISABLE_SQFLOW &&
>             nvmf_check_cap("disable_sqflow"))
>                 c->cfg.disable_sqflow = true;
>         else
>                 c->cfg.disable_sqflow = false;
>
> Daniel? Any reason why we are caching these opts in an executable
> with a short lifetime? And apparently we need to be very precise when
> we rely on this cache to be stable.

I agree an interface like that to get the supported options on demand
would be much more ergonomic. I think the current design makes sense
as long the options are only needed in build_options(). I do think
it's good to minimize system calls to get this information; making one
for each option passed would be a significant overhead. The
type-checking provided by the nvme_fabric_options field names also
seems beneficial over plain string comparisons. But we could easily
add a nvmf_check_cap() wrapper around __nvmf_supported_options(),
since __nvmf_supported_options() already caches the supported options.
(I would personally prefer a name like "nvmef_option_supported" since
"capability" has a different meaning in the NVMe specifications.)

#define nvmf_option_supported(r, tok) \
        (!__nvmf_supported_options(r) && (r)->options->tok)


>
> If you prefer to cache it, at least do it in the start of every
> fabrics callback so we don't need to understand when this cache
> is valid...
>
> >
> >>                  c->cfg.disable_sqflow = true;
> >>
> >>          if (e->trtype == NVMF_TRTYPE_TCP &&
> >> @@ -1038,16 +1038,6 @@ nvme_ctrl_t nvmf_connect_disc_entry(nvme_host_t h,
> >>          if (!ret)
> >>                  return c;
> >>
> >> -       if (errno == EINVAL && c->cfg.disable_sqflow) {
> >> -               errno = 0;
> >> -               /* disable_sqflow is unrecognized option on older kernels */
> >
> > The kernel didn't add support for reporting its supported options
> > until 5.17. If the kernel doesn't report this information, we fall
> > back to default_supported_options, which represents the options the
> > kernel supported when the reporting mechanism was added. These include
> > disable_sqflow, so h->r->options->disable_sqflow will always be true
> > (unless support for it is removed in the future), even if the kernel
> > is an old version that doesn't support it. So if this disable_sqflow
> > fallback behavior is necessary for older kernels, I don't think it can
> > safely be replaced by the h->r->options->disable_sqflow check.
>
> You right, it needs an else case, so we always set it based on the
> default/treq _and_ what the kernel supports.
>
> I will hold off with this patch until Daniel will reply to what
> he thinks needs to be done here.



More information about the Linux-nvme mailing list