[PATCH v15 05/20] nvme-tcp: Add DDP offload control path
Sagi Grimberg
sagi at grimberg.me
Wed Sep 13 03:46:45 PDT 2023
On 9/13/23 12:10, Aurelien Aptel wrote:
> Sagi Grimberg <sagi at grimberg.me> writes:
>>> + if (test_bit(NVME_TCP_Q_OFF_DDP, &queue->flags))
>>> + nvme_tcp_unoffload_socket(queue);
>>> +#ifdef CONFIG_ULP_DDP
>>> + if (nvme_tcp_admin_queue(queue) && queue->ctrl->ddp_netdev) {
>>> + /* put back ref from get_netdev_for_sock() */
>>> + dev_put(queue->ctrl->ddp_netdev);
>>> + queue->ctrl->ddp_netdev = NULL;
>>> + }
>>> +#endif
>>
>> Lets avoid spraying these ifdefs in the code.
>> the ddp_netdev struct member can be lifted out of the ifdef I think
>> because its only controller-wide.
>>
>
> Ok, we will remove the ifdefs.
>
>>> +#ifdef CONFIG_ULP_DDP
>>> + /*
>>> + * Admin queue takes a netdev ref here, and puts it
>>> + * when the queue is stopped in __nvme_tcp_stop_queue().
>>> + */
>>> + ctrl->ddp_netdev = get_netdev_for_sock(queue->sock->sk);
>>> + if (ctrl->ddp_netdev) {
>>> + if (nvme_tcp_ddp_query_limits(ctrl)) {
>>> + nvme_tcp_ddp_apply_limits(ctrl);
>>> + } else {
>>> + dev_put(ctrl->ddp_netdev);
>>> + ctrl->ddp_netdev = NULL;
>>> + }
>>> + } else {
>>> + dev_info(nctrl->device, "netdev not found\n");
>>
>> Would prefer to not print offload specific messages in non-offload code
>> paths. at best, dev_dbg.
>
> Sure, we will switch to dev_dbg.
>
>> If the netdev is derived by the sk, why does the interface need a netdev
>> at all? why not just pass sk and derive the netdev from the sk behind
>> the interface?
>>
>> Or is there a case that I'm not seeing here?
>
> If we derive the netdev from the socket, it would be too costly to call
> get_netdev_for_sock() which takes a lock on the data path.
>
> We could store it in the existing sk->ulp_ddp_ctx, assigning it in
> sk_add and accessing it in sk_del/setup/teardown/resync.
> But we would run into the problem of not being sure
> get_netdev_for_sock() returned the same device in query_limits() and
> sk_add() because we did not keep a pointer to it.
>
> We believe it would be more complex to deal with these problems than to
> just keep a reference to the netdev in the nvme-tcp controller.
>
OK. It seems, though, that the netdev and the limits are bundled together,
meaning that you either get both or neither.
Perhaps you should bundle them:

	ctrl->ddp_netdev = nvme_tcp_get_ddp_netdev_with_limits(ctrl);
	if (ctrl->ddp_netdev)
		nvme_tcp_ddp_apply_ctrl_limits(ctrl);
where:
static struct net_device *
nvme_tcp_get_ddp_netdev_with_limits(struct nvme_tcp_ctrl *ctrl)
{
	struct net_device *netdev;
	int ret;

	if (!ddp_offload)
		return NULL;

	netdev = get_netdev_for_sock(ctrl->queues[0].sock->sk);
	if (!netdev)
		return NULL;

	ret = ulp_ddp_query_limits(netdev, &ctrl->ddp_limits,
				   ULP_DDP_NVME, ULP_DDP_C_NVME_TCP_BIT,
				   false /* tls */);
	if (ret) {
		dev_put(netdev);
		return NULL;
	}
	return netdev;
}
And perhaps it's time to introduce nvme_tcp_stop_admin_queue()?

static void nvme_tcp_stop_admin_queue(struct nvme_ctrl *nctrl)
{
	struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl);

	nvme_tcp_stop_queue(nctrl, 0);
	/* put back ref from get_netdev_for_sock() */
	dev_put(ctrl->ddp_netdev);
}
More information about the Linux-nvme mailing list