[PATCH 10/18] nvme-tcp: fixup send workflow for kTLS

Sagi Grimberg sagi at grimberg.me
Mon Apr 3 15:36:12 PDT 2023



On 4/3/23 21:48, Jakub Kicinski wrote:
> On Mon, 3 Apr 2023 18:51:09 +0300 Sagi Grimberg wrote:
>> What I'm assuming that Hannes is tripping on is that tls does
>> not accept when this flag is sent to sock_no_sendpage, which
>> is simply calling sendmsg. TLS will not accept this flag when
>> passed to sendmsg IIUC.
>>
>> Today the rough logic in nvme send path is:
>>
>> 	if (more_coming(queue)) {
>> 		flags = MSG_MORE | MSG_SENDPAGE_NOTLAST;
>> 	} else {
>> 		flags = MSG_EOR;
>> 	}
>>
>> 	if (!sendpage_ok(page)) {
>> 		kernel_sendpage();
>> 	} else {
>> 		sock_no_sendpage();
>> 	}
>>
>> This pattern (note that sock_no_sednpage was added later following bug
>> reports where nvme attempted to sendpage a slab allocated page), is
>> perfectly acceptable with normal sockets, but not with TLS.
>>
>> So there are two options:
>> 1. have tls accept MSG_SENDPAGE_NOTLAST in sendmsg (called from
>>      sock_no_sendpage)
>> 2. Make nvme set MSG_SENDPAGE_NOTLAST only when calling
>>      kernel_sendpage and clear it when calling sock_no_sendpage
>>
>> If you say that MSG_SENDPAGE_NOTLAST must be cleared when calling
>> sock_no_sendpage and it is a bug that it isn't enforced for normal tcp
>> sockets, then we need to change nvme, but I did not find
>> any documentation that indicates it, and right now, normal sockets
>> behave differently than tls sockets (wrt this flag in particular).
>>
>> Hope this clarifies.
> 
> Oh right, it does, the context evaporated from my head over the weekend.
> 
> IMHO it's best if the caller passes the right flags. The semantics of
> MSG_MORE vs NOTLAST are quite murky and had already caused bugs in the
> past :(
> 
> See commit d452d48b9f8b ("tls: prevent oversized sendfile() hangs by
> ignoring MSG_MORE")

Well, that is fine with me. This may change anyways with
the new MSG_SPLICE_PAGES from David.

> Alternatively we could have sock_no_sendpage drop NOTLAST to help
> all protos. But if we consider sendfile behavior as the standard
> simply clearing it isn't right, it should be a:
> 
> 	more = (flags & (MORE | NOTLAST)) == MORE | NOTLAST
> 	flags &= ~(MORE | NOTLAST)
> 	if (more)
> 		flags |= MORE

I don't think this would be the best option. Requiring callers
to clear NOTLAST if not calling sendpages is reasonable, but we
need to have this consistent. And also fix this pattern for the
rest of the kernel socket consumers that use this flag.



More information about the Linux-nvme mailing list