[PATCH v2] nvmet-tcp: Enforce update ordering between queue->cmd and rcv_state

Meir Elisha meir.elisha at volumez.com
Wed Feb 19 04:28:14 PST 2025



On 18/02/2025 17:19, Keith Busch wrote:
> On Mon, Feb 17, 2025 at 04:22:10PM +0200, Meir Elisha wrote:
>> The order in which queue->cmd and rcv_state are updated is crucial.
>> If these assignments are reordered by the compiler, the worker might not
>> get queued in nvmet_tcp_queue_response(), hanging the IO. To enforce the
>> correct ordering, set rcv_state using smp_store_release().
>>
>> Signed-off-by: Meir Elisha <meir.elisha at volumez.com>
>> ---
>> v2: Change comments to c-style
>>
>>  drivers/nvme/target/tcp.c | 6 ++++--
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
>> index 7c51c2a8c109..49ce2f9ac6c8 100644
>> --- a/drivers/nvme/target/tcp.c
>> +++ b/drivers/nvme/target/tcp.c
>> @@ -848,7 +848,8 @@ static void nvmet_prepare_receive_pdu(struct nvmet_tcp_queue *queue)
>>  	queue->offset = 0;
>>  	queue->left = sizeof(struct nvme_tcp_hdr);
>>  	queue->cmd = NULL;
>> -	queue->rcv_state = NVMET_TCP_RECV_PDU;
>> +	/* Ensure rcv_state is visible only after queue->cmd is set */
>> +	smp_store_release(&queue->rcv_state, NVMET_TCP_RECV_PDU);
>>  }
>>  
>>  static void nvmet_tcp_free_crypto(struct nvmet_tcp_queue *queue)
>> @@ -1017,7 +1018,8 @@ static int nvmet_tcp_handle_h2c_data_pdu(struct nvmet_tcp_queue *queue)
>>  	cmd->pdu_recv = 0;
>>  	nvmet_tcp_build_pdu_iovec(cmd);
>>  	queue->cmd = cmd;
>> -	queue->rcv_state = NVMET_TCP_RECV_DATA;
>> +	/* Ensure rcv_state is visible only after queue->cmd is set */
>> +	smp_store_release(&queue->rcv_state, NVMET_TCP_RECV_DATA);
>>  
>>  	return 0;
>>  
>> -- 
>> 2.34.1
>>
>> This ordering is critical on weakly ordered architectures (such as ARM)
>> so that any observer which sees the new rcv_state is guaranteed to also
>> see the updated cmd.
> 
> Something seems off if smp_store_release() isn't paired with
> smp_load_acquire(). Why does the reader side not need a barrier?

Hi Keith

Thanks for the reply. After reviewing the code again, I think there may
still be a race condition here.

Consider the following: the worker thread executes the request
(queue->cmd->req.execute), and before it regains execution,
nvmet_tcp_queue_response() is called from another context. It passes the
first check (queue->cmd == cmd), but just before it evaluates the second
one (queue->rcv_state == NVMET_TCP_RECV_PDU), the worker runs again and
updates both queue->cmd and rcv_state. In that case, the second thread
mistakenly bails out on the second check, leaving the IO hanging.

I will send another version that reads cmd and rcv_state into local
variables in nvmet_tcp_queue_response() (using READ_ONCE() and a read
barrier) in the opposite order, which should enforce the correct
ordering and fix the race described above.


