[PATCH] nvme-tcp: send H2CData PDUs based on MAXH2CDATA
Sagi Grimberg
sagi at grimberg.me
Thu Nov 18 01:10:20 PST 2021
>>> @@ -1022,14 +1056,26 @@ static int nvme_tcp_try_send_data_pdu(struct nvme_tcp_request *req)
>>>  	struct nvme_tcp_data_pdu *pdu = req->pdu;
>>>  	u8 hdgst = nvme_tcp_hdgst_len(queue);
>>>  	int len = sizeof(*pdu) - req->offset + hdgst;
>>> +	int flags = MSG_DONTWAIT | MSG_MORE;
>>>  	int ret;
>>>
>>>  	if (queue->hdr_digest && !req->offset)
>>>  		nvme_tcp_hdgst(queue->snd_hash, pdu, sizeof(*pdu));
>>>
>>> -	ret = kernel_sendpage(queue->sock, virt_to_page(pdu),
>>> -			offset_in_page(pdu) + req->offset, len,
>>> -			MSG_DONTWAIT | MSG_MORE | MSG_SENDPAGE_NOTLAST);
>>> +	if (req->rem_r2t_len) {
>>> +		struct msghdr msg = { .msg_flags = flags };
>>> +		struct kvec iov = {
>>> +			.iov_base = (u8 *)pdu + req->offset,
>>> +			.iov_len = len
>>> +		};
>>> +
>>> +		ret = kernel_sendmsg(queue->sock, &msg, &iov, 1, iov.iov_len);
>>> +	} else {
>>> +		ret = kernel_sendpage(queue->sock, virt_to_page(pdu),
>>> +				offset_in_page(pdu) + req->offset, len,
>>> +				flags | MSG_SENDPAGE_NOTLAST);
>>> +	}
>>
>> Why is this needed? Seems out-of-place to me...
>
> As per my understanding, kernel_sendpage() does zero-copy TX:
> returning from kernel_sendpage() does not guarantee that the buffer
> has been transmitted or DMA-read by the NIC. Is this not correct?
>
> If the driver reuses the same buffer for the next H2CData PDU header,
> it can corrupt the previous H2CData PDU header.
Yes, please use sock_no_sendpage instead of kernel_sendmsg.
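
Something like this (untested sketch, reusing the rem_r2t_len check and
the flags variable from your patch); sock_no_sendpage() keeps the
sendpage calling convention but does a copying send, so the PDU header
buffer is free for reuse as soon as it returns:

	if (req->rem_r2t_len) {
		/* more H2CData PDUs will reuse this header buffer, so
		 * do a copying send instead of a zero-copy sendpage
		 */
		ret = sock_no_sendpage(queue->sock, virt_to_page(pdu),
				offset_in_page(pdu) + req->offset, len,
				flags);
	} else {
		ret = kernel_sendpage(queue->sock, virt_to_page(pdu),
				offset_in_page(pdu) + req->offset, len,
				flags | MSG_SENDPAGE_NOTLAST);
	}

Internally sock_no_sendpage() just kmaps the page and calls
kernel_sendmsg(), so the data is copied into the skb before the call
returns, whereas kernel_sendpage() may keep a reference to the page
until the peer acks the data.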