[PATCH 10/18] nvme-tcp: fixup send workflow for kTLS

Sagi Grimberg sagi at grimberg.me
Mon Apr 3 08:51:09 PDT 2023


>>>> Some of the flags are call specific, others may be internal to the
>>>> networking stack (e.g. the DECRYPTED flag). Old protocols didn't do
>>>> any validation because people coded more haphazardly in the 90s.
>>>> This lack of validation is a major source of technical debt :(
>>>
>>> A-ha. So what is the plan?
>>> Should the stack validate flags?
>>> And should the rules for validating be the same for all protocols?
>>
>> MSG_SENDPAGE_NOTLAST is not an internal flag, I thought it was
>> essentially similar semantics to MSG_MORE but for sendpage. It'd
>> be great if this can be allowed in tls (again, at the very least
>> don't fail but continue as if it wasn't passed).
> 
> .. but.. MSG_SENDPAGE_NOTLAST is supported in TLS, isn't it?
> Why are we talking about it?

Ah, right.

I assume what Hannes is tripping on is that tls rejects this flag
when it is passed to sock_no_sendpage, which simply calls sendmsg.
TLS will not accept this flag when passed to sendmsg, IIUC.
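
To make the failure mode concrete, here is a small userspace model of
the flag checks. It is only a sketch: the model_* names and the two
allow-list masks are made up for illustration and are not copied from
net/tls, the -EOPNOTSUPP return value is an assumption, and the
MSG_SENDPAGE_NOTLAST value is defined locally because the flag is not
exported to userspace headers.

```c
#include <errno.h>
#include <sys/socket.h>

/* Fallback definitions for flags that userspace headers may not provide. */
#ifndef MSG_MORE
#define MSG_MORE 0x8000
#endif
#ifndef MSG_SENDPAGE_NOTLAST
#define MSG_SENDPAGE_NOTLAST 0x20000
#endif

/* Illustrative allow-lists (assumed, not the kernel's actual masks). */
#define MODEL_TLS_SENDMSG_OK	(MSG_MORE | MSG_DONTWAIT)
#define MODEL_TLS_SENDPAGE_OK	(MODEL_TLS_SENDMSG_OK | MSG_SENDPAGE_NOTLAST)

/* Model of a tls sendmsg that rejects any flag outside its allow-list. */
static int model_tls_sendmsg(int flags)
{
	return (flags & ~MODEL_TLS_SENDMSG_OK) ? -EOPNOTSUPP : 0;
}

/* Model of a tls sendpage that additionally accepts MSG_SENDPAGE_NOTLAST. */
static int model_tls_sendpage(int flags)
{
	return (flags & ~MODEL_TLS_SENDPAGE_OK) ? -EOPNOTSUPP : 0;
}

/* Model of sock_no_sendpage: forwards the caller's flags to sendmsg as-is. */
static int model_sock_no_sendpage(int flags)
{
	return model_tls_sendmsg(flags);
}
```

In this model the same flags value succeeds through the sendpage entry
point and fails through the sock_no_sendpage fallback, which is the
asymmetry described above.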

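Today the rough logic in the nvme send path is:

	/* pick flags depending on whether more data follows */
	if (more_coming(queue)) {
		flags = MSG_MORE | MSG_SENDPAGE_NOTLAST;
	} else {
		flags = MSG_EOR;
	}

	/* fall back to sendmsg for pages that must not go to sendpage */
	if (sendpage_ok(page)) {
		kernel_sendpage();
	} else {
		sock_no_sendpage();
	}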

This pattern is perfectly acceptable with normal sockets, but not with
TLS. (Note that the sock_no_sendpage fallback was added later, following
bug reports where nvme attempted to sendpage a slab allocated page.)

So there are two options:
1. Have tls accept MSG_SENDPAGE_NOTLAST in sendmsg (called from
    sock_no_sendpage)
2. Have nvme set MSG_SENDPAGE_NOTLAST only when calling
    kernel_sendpage and clear it when calling sock_no_sendpage
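
For what it's worth, option 2 could look roughly like the sketch below.
This is a userspace model, not a patch: nvme_tcp_pick_flags() and its
two bool parameters are hypothetical names, and MSG_SENDPAGE_NOTLAST is
defined locally with the value I assume it has in include/linux/socket.h.

```c
#include <stdbool.h>
#include <sys/socket.h>

/* Fallback definitions for flags that userspace headers may not provide. */
#ifndef MSG_MORE
#define MSG_MORE 0x8000
#endif
#ifndef MSG_SENDPAGE_NOTLAST
#define MSG_SENDPAGE_NOTLAST 0x20000
#endif

/*
 * Hypothetical helper: compute the send flags for one page.
 * more_coming:  more data will follow this page
 * use_sendpage: the page goes through kernel_sendpage rather than
 *               the sock_no_sendpage (sendmsg) fallback
 */
static int nvme_tcp_pick_flags(bool more_coming, bool use_sendpage)
{
	int flags = more_coming ? MSG_MORE : MSG_EOR;

	/* Only the sendpage path ever sees MSG_SENDPAGE_NOTLAST. */
	if (more_coming && use_sendpage)
		flags |= MSG_SENDPAGE_NOTLAST;

	return flags;
}
```

The caller would pass use_sendpage = sendpage_ok(page), so the fallback
path only ever sees MSG_MORE or MSG_EOR.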

If you say that MSG_SENDPAGE_NOTLAST must be cleared when calling
sock_no_sendpage, and that it is a bug that this isn't enforced for
normal tcp sockets, then we need to change nvme. But I did not find
any documentation that says so, and right now normal sockets
behave differently than tls sockets (wrt this flag in particular).

Hope this clarifies.


