[PATCH libnvme] fabrics: don't pass sqflow_disable if the kernel does not support the option

Sagi Grimberg sagi at grimberg.me
Mon Jul 24 08:35:55 PDT 2023


> Hi Sagi,
> 
> On Mon, Jul 24, 2023 at 2:11 AM Sagi Grimberg <sagi at grimberg.me> wrote:
>>
>> Instead of retrying again without it.
>>
>> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
>> ---
>>   src/nvme/fabrics.c | 12 +-----------
>>   1 file changed, 1 insertion(+), 11 deletions(-)
>>
>> diff --git a/src/nvme/fabrics.c b/src/nvme/fabrics.c
>> index fd761d10d0cb..2a9d8dc7f23a 100644
>> --- a/src/nvme/fabrics.c
>> +++ b/src/nvme/fabrics.c
>> @@ -1026,7 +1026,7 @@ nvme_ctrl_t nvmf_connect_disc_entry(nvme_host_t h,
>>                  return NULL;
>>          }
>>
>> -       if (e->treq & NVMF_TREQ_DISABLE_SQFLOW)
>> +       if (e->treq & NVMF_TREQ_DISABLE_SQFLOW && h->r->options->disable_sqflow)
> 
> How do we know options is initialized here? It looks like that's done
> in __nvmf_supported_options(), called from nvmf_add_ctrl(), which is
> called a few lines later in this function.

This is for connecting discovery log entries, so the kernel cap was
already read once for the discovery controller.

Although granted, this can be udev driven on an existing discovery
controller. Probably need to do it regardless of adding a controller.

I remember when adding the capability to check supported opts I added
an interface to do it per feature. Not sure why it was changed, it is
much more sane to just read it on demand given that there is just a
simple and short string copy in the kernel...

Ideally this would become:
	if (e->treq & NVMF_TREQ_DISABLE_SQFLOW &&
	    nvmf_check_cap("disable_sqflow"))
		c->cfg.disable_sqflow = true;
	else
		c->cfg.disable_sqflow = false;

Daniel? Any reason why we are caching these opts in an executable
with a short lifetime? And apparently we need to be very precise when
we rely on this cache to be stable.

If you prefer to cache it, at least do it in the start of every
fabrics callback so we don't need to understand when this cache
is valid...

> 
>>                  c->cfg.disable_sqflow = true;
>>
>>          if (e->trtype == NVMF_TRTYPE_TCP &&
>> @@ -1038,16 +1038,6 @@ nvme_ctrl_t nvmf_connect_disc_entry(nvme_host_t h,
>>          if (!ret)
>>                  return c;
>>
>> -       if (errno == EINVAL && c->cfg.disable_sqflow) {
>> -               errno = 0;
>> -               /* disable_sqflow is unrecognized option on older kernels */
> 
> The kernel didn't add support for reporting its supported options
> until 5.17. If the kernel doesn't report this information, we fall
> back to default_supported_options, which represents the options the
> kernel supported when the reporting mechanism was added. These include
> disable_sqflow, so h->r->options->disable_sqflow will always be true
> (unless support for it is removed in the future), even if the kernel
> is an old version that doesn't support it. So if this disable_sqflow
> fallback behavior is necessary for older kernels, I don't think it can
> safely be replaced by the h->r->options->disable_sqflow check.

You right, it needs an else case, so we always set it based on the
default/treq _and_ what the kernel supports.

I will hold off with this patch until Daniel will reply to what
he thinks needs to be done here.



More information about the Linux-nvme mailing list