[PATCH] nvmet-tcp: add bounds checks in nvmet_tcp_build_pdu_iovec

Thu Mar 5 02:49:52 PST 2026

On Thu Mar 5, 2026 at 4:01 AM CET, yunje shin wrote:
> Hi,
>
> When I wrote the patch, I followed the existing convention of calling
> nvmet_tcp_fatal_error() directly. But nvmet_tcp_build_pdu_iovec()
> returns void, and having reviewed the AI agent's analysis, I think the
> concern has some merit: the caller could overwrite rcv_state with
> NVMET_TCP_RECV_DATA after the failure, causing the state machine to
> proceed with an uninitialized cmd->recv_msg.msg_iter.
>
> Maurizio's approach of changing the return type to int and letting
> callers propagate the error makes sense. This also aligns with his
> earlier patch [1] to remove redundant nvmet_tcp_fatal_error() calls
> from lower-level functions.
>
> To confirm this, would the following sequence trigger the issue?
>
> 1. Send a crafted H2C Data PDU with an invalid data_offset or length
>    that causes sg_idx >= sg_cnt in nvmet_tcp_build_pdu_iovec()
> 2. The bounds check triggers -> nvmet_tcp_fatal_error() sets
>    rcv_state = NVMET_TCP_RECV_ERR -> early return (void)
> 3. Back in nvmet_tcp_handle_h2c_data_pdu(), the caller overwrites
>    rcv_state with NVMET_TCP_RECV_DATA (ignoring the failure)
> 4. The state machine proceeds to nvmet_tcp_try_recv_data(),
>    which calls recvmsg() using the uninitialized cmd->recv_msg.msg_iter

Yes, this is the sequence that should crash the kernel.
I say "should" because I don't have a reproducer yet to confirm it.
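
To make the four steps above concrete, here is a minimal user-space
sketch of the pattern. It is only a model, not the driver code: struct
model_queue, the iter_ready flag (standing in for an initialized
cmd->recv_msg.msg_iter), the shortened enum names, the helper names and
the -EINVAL return value are all assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Simplified user-space model of the receive state; NOT kernel code. */
enum recv_state { RECV_PDU, RECV_DATA, RECV_ERR };

struct model_queue {
	enum recv_state rcv_state;
	bool iter_ready;	/* hypothetical: "msg_iter has been set up" */
};

/* Current pattern: failure is reported only by setting rcv_state. */
static void build_iovec_void(struct model_queue *q, int sg_idx, int sg_cnt)
{
	if (sg_idx >= sg_cnt) {
		q->rcv_state = RECV_ERR;	/* models nvmet_tcp_fatal_error() */
		return;				/* early return, iter untouched */
	}
	q->iter_ready = true;
}

static void handle_h2c_void(struct model_queue *q, int sg_idx, int sg_cnt)
{
	build_iovec_void(q, sg_idx, sg_cnt);
	q->rcv_state = RECV_DATA;	/* silently overwrites RECV_ERR */
}

/* Alternative: return an error code and let the caller act on it. */
static int build_iovec_int(struct model_queue *q, int sg_idx, int sg_cnt)
{
	if (sg_idx >= sg_cnt)
		return -EINVAL;		/* assumed error code */
	q->iter_ready = true;
	return 0;
}

static int handle_h2c_int(struct model_queue *q, int sg_idx, int sg_cnt)
{
	int ret = build_iovec_int(q, sg_idx, sg_cnt);

	if (ret) {
		q->rcv_state = RECV_ERR;	/* error state is preserved */
		return ret;
	}
	q->rcv_state = RECV_DATA;
	return 0;
}
```

In this model, handle_h2c_void(&q, 4, 4) on a fresh queue leaves
rcv_state at RECV_DATA while iter_ready is still false, which is the
combination described in step 4. handle_h2c_int() with the same
arguments returns -EINVAL and leaves rcv_state at RECV_ERR.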

In any case, the code looks a bit fragile and I am going to submit a
patchset to clean it up.

It should be ready very soon.

Maurizio