[PATCHv6 0/5] net/tls: fixes for NVMe-over-TLS
Hannes Reinecke
hare at suse.de
Mon Jul 3 06:57:28 PDT 2023
On 7/3/23 15:42, Sagi Grimberg wrote:
>
>>>> Hannes Reinecke <hare at suse.de> wrote:
>>>>
>>>>>> 'discover' and 'connect' works, but when I'm trying to transfer data
>>>>>> (eg by doing a 'mkfs.xfs') the whole thing crashes horribly in
>>>>>> sock_sendmsg() as it's trying to access invalid pages :-(
>>>>
>>>> Can you be more specific about the crash?
>>>
>>> Hannes,
>>>
>>> See:
>>> [PATCH net] nvme-tcp: Fix comma-related oops
>>
>> Ah, right. That solves _that_ issue.
>>
>> But now I'm deadlocking on the tls_rx_reader_lock() (patched as to
>> your suggestion). Investigating.
>
> Are you sure it is a deadlock? or maybe you returned EAGAIN and nvme-tcp
> does not interpret this as a transient status and simply returns from
> io_work?
>
Unfortunately, yes.

static int tls_rx_reader_acquire(struct sock *sk,
                                 struct tls_sw_context_rx *ctx,
                                 bool nonblock)
{
        long timeo;

        timeo = sock_rcvtimeo(sk, nonblock);

        while (unlikely(ctx->reader_present)) {
                DEFINE_WAIT_FUNC(wait, woken_wake_function);

                ctx->reader_contended = 1;

                add_wait_queue(&ctx->wq, &wait);
                sk_wait_event(sk, &timeo,
                              !READ_ONCE(ctx->reader_present), &wait);

and sk_wait_event() does:

#define sk_wait_event(__sk, __timeo, __condition, __wait)              \
        ({      int __rc;                                               \
                __sk->sk_wait_pending++;                                \
                release_sock(__sk);                                     \
                __rc = __condition;                                     \
                if (!__rc) {                                            \
                        *(__timeo) = wait_woken(__wait,                 \
                                                TASK_INTERRUPTIBLE,     \
                                                *(__timeo));            \
                }                                                       \
                sched_annotate_sleep();                                 \
                lock_sock(__sk);                                        \
                __sk->sk_wait_pending--;                                \
                __rc = __condition;                                     \
                __rc;                                                   \
        })

so not calling 'lock_sock()' in tls_rx_reader_acquire() helps only _so_
much: sk_wait_event() itself drops and re-takes the socket lock around
the wait, and we're still deadlocking.

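To make the lock handling in the macro easier to follow, here is a rough
user-space model of it. This is only a sketch, not kernel code: struct
model_sock, model_lock_sock(), model_release_sock() and model_wait_event()
are hypothetical stand-ins, the sleep in wait_woken() is left out, and the
"lock" is just a flag. It only mirrors the order of operations in
sk_wait_event() as quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct sock: tracks only the lock state. */
struct model_sock {
        bool owned;        /* true while the modelled socket lock is held */
        int wait_pending;  /* mirrors sk_wait_pending */
        int lock_count;    /* number of model_lock_sock() calls */
        int release_count; /* number of model_release_sock() calls */
};

static void model_lock_sock(struct model_sock *sk)
{
        /* The real lock_sock() would sleep here if the lock were held. */
        assert(!sk->owned);
        sk->owned = true;
        sk->lock_count++;
}

static void model_release_sock(struct model_sock *sk)
{
        sk->owned = false;
        sk->release_count++;
}

/*
 * Same ordering as sk_wait_event(): drop the lock, test the condition,
 * (sleep, omitted in this model), re-take the lock, test again.
 * Returns non-zero once no reader is present.
 */
static int model_wait_event(struct model_sock *sk, const bool *reader_present)
{
        int rc;

        sk->wait_pending++;
        model_release_sock(sk);
        rc = !*reader_present;
        /* The real macro calls wait_woken() here when rc == 0. */
        model_lock_sock(sk);
        sk->wait_pending--;
        rc = !*reader_present;
        return rc;
}
```

In this model, a caller that never took the lock still comes back from
model_wait_event() with sk->owned set, because the helper locks
unconditionally on the way out. Whether that is the exact cause of the
hang seen here is not established by the model; it only shows that
skipping the lock in the caller does not keep the wait path away from
the socket lock.
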
Cheers,
Hannes