[PATCH 2/2] nvmet: fix a race condition between release_queue and io_work

Sagi Grimberg sagi at grimberg.me
Tue Oct 26 08:42:02 PDT 2021


Hi Maurizio,

On 10/21/21 11:41 AM, Maurizio Lombardi wrote:
> If the initiator executes a reset controller operation while
> performing I/O, the target kernel will crash because of a race condition
> between release_queue and io_work;

Can you add the stack trace?

> nvmet_tcp_uninit_data_in_cmds() may be executed while io_work
> is running, calling flush_work(io_work) was not sufficient to
> prevent this because io_work could requeue itself.

OK, then this should be sufficient to fix it, right?

> * Fix this bug by preventing io_work from being enqueued when
> sk_user_data is NULL (it means that the queue is going to be deleted)

This is triggered from the completion path, where the commands
are no longer fetching data from the host. How does this
prevent the crash?
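
To make the requeue mechanics under discussion concrete, here is a
minimal single-threaded userspace sketch. It is a toy model, not the
nvmet-tcp code and not the kernel workqueue API: toy_work, toy_queue,
toy_queue_work(), toy_flush_work(), the tearing_down flag (standing in
for "sk_user_data is NULL") and the budget counter are all hypothetical
names invented for illustration. It only shows that a flush which runs
the queued instance once can leave a self-requeueing work item pending
again, and that a guard in the enqueue path stops that. It does not
model the completion path, so it does not answer whether such a guard
prevents the reported crash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a work item; NOT the kernel's struct work_struct. */
struct toy_work {
    bool pending;                        /* queued, waiting to run */
    void (*fn)(struct toy_work *w);      /* handler to execute */
};

/* Toy stand-in for a queue; all fields are illustrative assumptions. */
struct toy_queue {
    struct toy_work io_work;
    bool tearing_down;   /* stands in for "sk_user_data is NULL" */
    bool use_guard;      /* whether enqueue checks tearing_down */
    int budget;          /* units of I/O still left to process */
};

/* Enqueue io_work, optionally refusing once teardown has started. */
static void toy_queue_work(struct toy_queue *q)
{
    assert(q->io_work.fn != NULL);
    if (q->use_guard && q->tearing_down)
        return;
    q->io_work.pending = true;
}

/* Handler: process one unit, then requeue itself if work remains. */
static void toy_io_work(struct toy_work *w)
{
    /* Recover the enclosing queue from the embedded work item. */
    struct toy_queue *q = (struct toy_queue *)
        ((char *)w - offsetof(struct toy_queue, io_work));

    if (q->budget > 0)
        q->budget--;
    if (q->budget > 0)
        toy_queue_work(q);
}

/* Run the currently queued instance once; later requeues are not awaited. */
static void toy_flush_work(struct toy_work *w)
{
    if (!w->pending)
        return;
    w->pending = false;
    w->fn(w);
}

/* Returns true if io_work is pending again after teardown flushed it. */
static bool pending_after_flush(bool use_guard)
{
    struct toy_queue q = {
        .io_work = { .pending = false, .fn = toy_io_work },
        .tearing_down = false,
        .use_guard = use_guard,
        .budget = 3,
    };

    toy_queue_work(&q);          /* I/O is in flight */
    q.tearing_down = true;       /* release path starts */
    toy_flush_work(&q.io_work);  /* one run; handler may requeue itself */
    return q.io_work.pending;
}
```

In this model, pending_after_flush(false) reports that the work is
queued again after the flush, while pending_after_flush(true) reports
that it is not.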

> * Ensure that all the memory allocated for the commands' iovec is freed

How is this needed to prevent a crash?


