[PATCH 15/18] nvmet-tcp: enable TLS handshake upcall

Thu Mar 23 00:44:00 PDT 2023

Hey Hannes,

>>> To be precise: that's the reasoning I've come up with.
>>> Might be completely wrong, or beside the point.
>>> But what I _do_ know is that without this call userspace would
>>> _not_ see any data, and would happily sit there until we close the 
>>> socket due to a timeout (which also why I put in the timeout
>>> module options ...), and only _then_ see the data.
>>> But with this call everything works. So there.
>>
>> I may be missing something, and maybe its because I haven't looked at
>> the userspace side yet. But if it calls recvmsg, it should either
>> consume data, or it should block until it arrives.
>>
>> I'm finding it difficult to understand why emulating almost 100% the
>> default socket .data_ready is what is needed. But I'll try to understand
>> better.
>>
> Oh, I don't claim to have a full understanding, either.
> I had initially assumed that it would work 'out of the box', without us 
> having to specify a callback.
> Turns out that it doesn't.
> Maybe it's enough to call sk->sk_data_ready() after the upcall has been 
> started. But we sure have to do _something_.
> 
> If you have other solutions I'm all ears...

See my response to Chuck.

>>
>>>>> As outlined in the response to the nvme-tcp upcall, on the server 
>>>>> side we _have_ to allow for non-TLS connections (eg. for discovery).
>>>>
>>>> But in essence what you are doing is that you allow normal connections
>>>> for a secured port...
>>>>
>>> Correct.
>>
>> That is not what the feature is supposed to do.
>>
> Well ... it's a difference in interpretation.
> The NVMe TCP spec just says '... describes whether TLS is supported.'.
> It does not say 'required'.
> But there currently is no way how the server could describe 'TLS 
> supported' vs 'TLS required'.

But it doesn't make sense. If a subsystem wants to expose itself via
tls and non-tls, it needs to use 2 different ports, Which is a perfectly
reasonable requirement. Making TLS advisory misses the point, regardless
of the spec language.

Having support for advisory TLS is essentially implementing a permissive
mode, and that needs to be explicitly enabled (assuming we want to
support that), not the other way around.

>>>> btw, why not enforce a psk for the discovery controller (on this port)
>>>> as well? for secured ports? No one said that we must accept a
>>>> non-secured discovery connecting host on a secured port.
>>>>
>>> Because we can't have a PSK for standard discovery.
>>> Every discovery controller has the canonical discovery NQN, so every
>>> PSK will have the same subsystem NQN, and you will have to provide 
>>> the _same_ PSK to every attached storage system.
>>
>> Right, because they have the same NQN. Sounds good, what is the problem?
>> It is much better than to have sectype be nothing more than a
>> recommendation.
>>
>> If sectype tls1.x is defined on a port then every connection to that
>> port is a tls connection.
>>
> See above. Only if we decide to give TSAS a 'TLS required' meaning.
> With the this patchset it's a 'TLS supported' syntax.

Yes, tls should be a hard enforcement, regardless of the spec language.
The only way that tls would be a soft enforcement, is via an explicit
permissive-mode setting.

>>> If we had unique discovery controller NQNs everything would be good,
>>> but as we don't I guess we have to support normal connections in 
>>> addition to secured ones.
>>
>> Its orthogonal. the well-known disc-nqn may be "weaker", but tls
>> is still enforced. when we add unique disc-nqn, we can use different
>> psks.
>>
> Not only weaker, but _identical_ for each server.

I don't see any issue with that. I do see an issue with making tls
advisory universally in nvmet.

>>> As I said; maybe it's time to revisit the unique discovery NQN patches;
>>> with those we should be able to separate the various use-case like
>>> having a dedicated secured discovery port.
>>
>> I don't see why this would be needed.
>>
> We don't if we treat TSAS as 'TLS required', true.
> But that's not what the spec says.
> And there is also the dodgy statement in section 3.6.1.6:
>  > If a host that supports TLS for NVMe/TCP receives a discovery
>  > log entry indicating that the NVM subsystem uses NVMe/TCP and
>  > does not support TLS, then the host should nonetheless
>  > attempt to establish an NVMe/TCP connection that uses TLS.

First of all, "should" != "shall", so we are not obligated to do
this if it doesn't make sense for Linux, but regardless that is fine,
the host can fallback to non-tls in such a case if we choose to
implement this.

> which really indicates that the TSAS field is a recommendation only.
> 
> But really, we shouldn't read too much into what the TSAS field says.
> With the upcoming TPAR for TLS clarification all these different 
> interpretations should be clarified.
> 
> The key question, however, will remain: Do we _want_ to support a 'TLS 
> supported' mode for nvmet?
> I'd say we should, to have compability with older (client) installations.

That is great, but only with an explicit permissive port setting that
needs to be enabled, and later on disabled (i.e. when all the hosts get
upgraded over time).

> And I guess we have to anyway for secure concatenation
> (as this _requires_ that we start off unencrypted).

That I can understand. But it definitely should not mean that we
do a universally permissive tls.

> So I'd suggest to leave it for now, have the code to support both, and 
> wait for the TPAR to be ratified and then revisit this issue.

This is not a matter of a TPAR. It is about Linux. It doesn't make sense
to make tls universally permissive for _Linux_. I think we should
introduce code needed to support for secure concatenation only when
it is actually introduced.