[PATCH] nvme-tcp: send H2CData PDUs based on MAXH2CDATA

Sagi Grimberg sagi at grimberg.me
Thu Nov 18 01:10:20 PST 2021


>>> @@ -1022,14 +1056,26 @@ static int nvme_tcp_try_send_data_pdu(struct nvme_tcp_request *req)
>>>   	struct nvme_tcp_data_pdu *pdu = req->pdu;
>>>   	u8 hdgst = nvme_tcp_hdgst_len(queue);
>>>   	int len = sizeof(*pdu) - req->offset + hdgst;
>>> +	int flags = MSG_DONTWAIT | MSG_MORE;
>>>   	int ret;
>>>  
>>>   	if (queue->hdr_digest && !req->offset)
>>>   		nvme_tcp_hdgst(queue->snd_hash, pdu, sizeof(*pdu));
>>> -	ret = kernel_sendpage(queue->sock, virt_to_page(pdu),
>>> -			offset_in_page(pdu) + req->offset, len,
>>> -			MSG_DONTWAIT | MSG_MORE | MSG_SENDPAGE_NOTLAST);
>>> +	if (req->rem_r2t_len) {
>>> +		struct msghdr msg = { .msg_flags = flags };
>>> +		struct kvec iov = {
>>> +			.iov_base = (u8 *)pdu + req->offset,
>>> +			.iov_len = len
>>> +		};
>>> +
>>> +		ret = kernel_sendmsg(queue->sock, &msg, &iov, 1, iov.iov_len);
>>> +	} else {
>>> +		ret = kernel_sendpage(queue->sock, virt_to_page(pdu),
>>> +				      offset_in_page(pdu) + req->offset, len,
>>> +				      flags | MSG_SENDPAGE_NOTLAST);
>>> +	}
>>
>> Why is this needed? Seems out-of-place to me...
> 
> As per my understanding, kernel_sendpage() does zero-copy TX; a return
> from kernel_sendpage() does not guarantee that the buffer has been
> transmitted or DMA-read by the NIC. Is this not correct?
> 
> If the driver reuses the same buffer for the next H2CData PDU header,
> it can corrupt the previous H2CData PDU header.

Yes, please use sock_no_sendpage instead of kernel_sendmsg.
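For reference, a minimal untested sketch of what that could look like,
keeping the rem_r2t_len condition and flags from the patch above (the
field name rem_r2t_len is taken from this version of the patch):

	/*
	 * The PDU header buffer is reused for the next H2CData PDU, so
	 * the stack must not reference it after we return: use the
	 * copying sock_no_sendpage() instead of zero-copy sendpage.
	 */
	if (req->rem_r2t_len)
		ret = sock_no_sendpage(queue->sock, virt_to_page(pdu),
				       offset_in_page(pdu) + req->offset, len,
				       flags);
	else
		ret = kernel_sendpage(queue->sock, virt_to_page(pdu),
				      offset_in_page(pdu) + req->offset, len,
				      flags | MSG_SENDPAGE_NOTLAST);

sock_no_sendpage() kmaps the page and sends it via kernel_sendmsg(),
so the header is copied into the socket rather than referenced, which
avoids the reuse hazard while keeping both branches symmetric with the
sendpage calling convention.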
