nvme_tcp BUG: unable to handle kernel NULL pointer dereference at 0000000000000230

Thu Jun 10 13:03:03 PDT 2021

> Correct, free_queue is being called (sock->sk becomes NULL) before restore_sock_calls
> 
> When restore_sock_calls is called, we fail on 'write_lock_bh(&sock->sk->sk_callback_lock)'
> 
> NULL pointer dereference at 0x230 → 560 decimal
> crash> struct sock -o
> struct sock {
>     [0] struct sock_common __sk_common;
>     ...
>     [560] rwlock_t sk_callback_lock;
> 
> The stop_queue call in ctx2 does not really do anything, since the 'NVME_TCP_Q_LIVE' bit has already been cleared (by ctx1).
> Can you please explain how stopping the queue before the free helps to serialize against ctx1?

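As a side note on the 0x230 arithmetic quoted above, here is a minimal
userspace sketch of why a NULL sock->sk faults at exactly that address.
struct model_sock and callback_lock_addr() are made-up stand-ins; only
the 560-byte offset is taken from the crash output, the rest of the
layout is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in layout: only the offset of sk_callback_lock matters here. */
struct model_sock {
	unsigned char before_lock[560];	/* everything crash lists ahead of the lock */
	int sk_callback_lock;		/* placeholder type, not the real rwlock_t */
};

/* The offset reported by crash, [560], is 0x230 in hex. */
static_assert(offsetof(struct model_sock, sk_callback_lock) == 0x230,
	      "sk_callback_lock is expected at offset 560 (0x230)");

/* Address of &sk->sk_callback_lock for a given value of the sk pointer. */
static uintptr_t callback_lock_addr(uintptr_t sk_base)
{
	return sk_base + offsetof(struct model_sock, sk_callback_lock);
}
```

With sk_base == 0 this gives 0x230, which matches the address in the
BUG line.
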
What I understood from your description is:
1. ctx1 calls stop_queue, which calls kernel_sock_shutdown
2. ctx1 gets to just before restore_sock_calls
3. ctx2 is triggered from state_change, which schedules err_work
4. ctx2 calls stop_queues
5. ctx2 calls destroy_queues, which does sock_release
6. ctx1 makes forward progress and accesses an already freed sk

Hence, with the mutex protection, ctx2 will block in step (4) and
cannot get to step (5) until ctx1 releases the mutex, which happens
in step (6).
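
To make that reading concrete, here is a rough userspace model using
pthreads. It is only a sketch of the serialization argument, not the
driver code: queue_lock, the model_* names and run_model() are all
hypothetical, and the LIVE bit is modeled as a plain bool that is
tested and cleared under the mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Userspace model only: these types stand in for the kernel objects. */
struct model_sock {
	int sk_callback_lock;		/* placeholder for sk->sk_callback_lock */
};

struct model_queue {
	pthread_mutex_t queue_lock;	/* hypothetical per-queue mutex */
	bool live;			/* stands in for NVME_TCP_Q_LIVE */
	struct model_sock *sock;	/* NULL once the queue is freed */
	int restores;			/* times restore ran on a valid sock */
	int null_derefs;		/* times it would have hit a NULL sock */
};

/* Models restore_sock_calls: it must see a valid socket. */
static void model_restore_sock_calls(struct model_queue *q)
{
	if (!q->sock) {
		q->null_derefs++;	/* the kernel would oops here */
		return;
	}
	q->sock->sk_callback_lock++;	/* stands in for write_lock_bh() */
	q->restores++;
}

/* Models stop_queue: the LIVE test and the teardown share one critical section. */
static void model_stop_queue(struct model_queue *q)
{
	pthread_mutex_lock(&q->queue_lock);
	if (q->live) {
		q->live = false;
		model_restore_sock_calls(q);
	}
	pthread_mutex_unlock(&q->queue_lock);
}

/* Models free_queue / sock_release: only called after stop_queue returned. */
static void model_free_queue(struct model_queue *q)
{
	free(q->sock);
	q->sock = NULL;
}

static void *ctx1(void *arg)		/* teardown path: stop only */
{
	model_stop_queue(arg);
	return NULL;
}

static void *ctx2(void *arg)		/* err_work path: stop, then destroy */
{
	model_stop_queue(arg);
	model_free_queue(arg);
	return NULL;
}

/* Runs both contexts once; true if restore ran exactly once on a valid sock. */
static bool run_model(void)
{
	struct model_queue q = { .live = true };
	pthread_t t1, t2;
	int rc;

	q.sock = calloc(1, sizeof(*q.sock));
	if (!q.sock)
		abort();
	rc = pthread_mutex_init(&q.queue_lock, NULL);
	assert(rc == 0);
	rc = pthread_create(&t1, NULL, ctx1, &q);
	assert(rc == 0);
	rc = pthread_create(&t2, NULL, ctx2, &q);
	assert(rc == 0);
	(void)rc;			/* keep builds with NDEBUG warning-free */
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	pthread_mutex_destroy(&q.queue_lock);

	return q.null_derefs == 0 && q.restores == 1 && q.sock == NULL;
}
```

Whichever context wins the mutex, the other one either waits in
model_stop_queue() or finds the live flag already cleared and skips the
restore, so in this model the free in ctx2 can never overlap the
restore in ctx1.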

But maybe I'm not interpreting this correctly?