[PATCH 06/18] nvme-tcp: call 'queue->data_ready()' in nvme_tcp_data_ready()

Sagi Grimberg sagi at grimberg.me
Wed Mar 22 01:08:24 PDT 2023



On 3/21/23 16:09, Hannes Reinecke wrote:
> On 3/21/23 14:44, Sagi Grimberg wrote:
>>
>>> Call the original data_ready() callback in nvme_tcp_data_ready()
>>> to avoid a receive stall.
>>
>> Can you please improve the description to include what is the stall?
>> For example, does the stall exist today? If it is, I would like to
>> separate such patches from this set and include them asap.
>>
> That is actually particular to the TLS implementation, as it uses the 
> 'data_ready' callback to produce the data which can be read by eg 
> recvmsg().
> 
> Without this call there's no data to peruse for recvmsg().
> 
> But I'm not _that_ deep into networking details to know whether this is 
> TLS specific or an issue with any data_ready callback.
> I assume the latter, but then again, who knows.

Seems that this is only relevant when nvme_tcp runs on top of a ulp,
so a code comment would probably make sense here.

> 
> Hence the slightly vague description.
> 
>>>
>>> Signed-off-by: Hannes Reinecke <hare at suse.de>
>>> ---
>>>   drivers/nvme/host/tcp.c | 8 +++++---
>>>   1 file changed, 5 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
>>> index 0e14b1b90855..0512eb289dcf 100644
>>> --- a/drivers/nvme/host/tcp.c
>>> +++ b/drivers/nvme/host/tcp.c
>>> @@ -936,12 +936,14 @@ static void nvme_tcp_data_ready(struct sock *sk)
>>>       trace_sk_data_ready(sk);
>>> -    read_lock_bh(&sk->sk_callback_lock);
>>> -    queue = sk->sk_user_data;
>>> +    rcu_read_lock_bh();
>>
>> Now I understand your comment from a previous patch.
>> Can you explain why is this convention needed?
>>
>> I would prefer to have it as a separate patch with an
>> explanation to why it is needed.
>>
> This is the slightly odd socket callback handling.
> Any driver is free to set the socket callbacks, but it has to be aware 
> that it might not be the only one in the stack doing so.
> So one has to be prepared that the callbacks are set already, so we 
> should be calling them prior to our callback.

I meant the change from read_lock_bh to rcu_read_lock_bh and the
same for rcu_dereference_sk_user_data. It needs to be in a
separate patch with explanation to why it is needed.

> 
>>> +    queue = rcu_dereference_sk_user_data(sk);
>>> +    if (queue->data_ready)
>>> +        queue->data_ready(sk);
>>
>> Is the tls data_ready call sync or async? just for general knowledge?
>>
>>
> Sync, I guess. Otherwise we wouldn't be needing the lock ...

OK.



More information about the Linux-nvme mailing list