[PATCH 0/1] nvme-tcp: fence TCP socket on transport error

Mon Mar 20 15:09:46 PDT 2023

Even after "160f3549a907 nvme-tcp: fix UAF when detecting digest errors", I
don't think the queue->rd_enabled flag is fencing the TCP socket from further
receive processing.

io_work can be re-queued while running, and if it's pending when a receive
error occurs it will bypass the queue->rd_enabled checks that should prevent
queueing in nvme_tcp_data_ready.  Actually, it looks like nvme_tcp_write_space
and nvme_tcp_queue_request would schedule io_work anyway, which will always
call nvme_tcp_try_recv once regardless of rd_enabled.  And nvme_tcp_poll has no
checks against rd_enabled.
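
As a rough userspace illustration of that scheduling gap (not driver code:
struct sched_model and the model_* functions are hypothetical names made up
for this sketch, and the behaviour they encode is only what is described
above), a check made at queueing time cannot stop work that is already
pending, or work queued by a path that never checks the flag:

```c
#include <stdbool.h>

/* Hypothetical model of one queue; not the layout of struct nvme_tcp_queue. */
struct sched_model {
	bool rd_enabled;   /* cleared when a receive error is detected */
	bool work_pending; /* stands in for io_work being queued */
	int recv_calls;    /* how many times the receive path ran */
};

/* Models a data-ready callback: only queues work while reads are enabled. */
static void model_data_ready(struct sched_model *q)
{
	if (q->rd_enabled)
		q->work_pending = true;
}

/* Models a write-space callback: queues work without looking at the flag. */
static void model_write_space(struct sched_model *q)
{
	q->work_pending = true;
}

/* Models io_work: if it was queued, it always tries to receive once. */
static void model_io_work(struct sched_model *q)
{
	if (!q->work_pending)
		return;
	q->work_pending = false;
	q->recv_calls++; /* nothing here consults rd_enabled */
}
```

Queueing via model_data_ready, clearing rd_enabled, and then running
model_io_work still bumps recv_calls; so does any later model_write_space.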

After receiving an unsupported PDU, the header has been read but the payload
remains in the socket queue.  And the nvme_tcp queue state looks like it's
ready to receive the payload of a C2H data PDU.  Any additional calls to
nvme_tcp_try_recv can incorrectly interpret the next bits as a command ID,
look up a request from the tagset using this bogus ID, and start copying the
payload data from the unsupported PDU to an invalid destination address.

This has been seen with a buggy target that sent extraneous bytes in the TCP
stream, but also I believe with a properly functioning target that sent a
Controller to Host Terminate Connection (C2HTermReq).  The Fatal Error Status
field was used as a Command ID and brought the host system down.
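
To make the aliasing concrete, here is a small userspace sketch.  It assumes
that both the Fatal Error Status field of a termination PDU and the command
ID of a C2H data PDU sit directly after an 8-byte common header; that offset,
the request table, and every model_* name are assumptions for illustration,
not taken from the driver:

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_HDR_LEN 8 /* assumed size of the common PDU header */

/* Read a little-endian 16-bit field at a byte offset. */
static uint16_t model_le16_at(const unsigned char *pdu, size_t off)
{
	return (uint16_t)(pdu[off] | (pdu[off + 1] << 8));
}

/* View of the bytes as a termination PDU: status follows the header. */
static uint16_t model_term_status(const unsigned char *pdu)
{
	return model_le16_at(pdu, MODEL_HDR_LEN);
}

/* View of the same bytes as a data PDU: command ID follows the header. */
static uint16_t model_data_command_id(const unsigned char *pdu)
{
	return model_le16_at(pdu, MODEL_HDR_LEN);
}

/* Hypothetical request table lookup; NULL when the ID has no slot. */
static const int *model_find_request(const int *table, size_t nr, uint16_t id)
{
	return id < nr ? &table[id] : NULL;
}
```

A small status value such as 2 passes the range check and selects slot 2 of
the table, which is how a status code can end up naming a live request.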

An additional check against queue->rd_enabled at the start of nvme_tcp_recv_skb
should protect against both additional io_work scheduling and nvme_tcp_poll use
after a receive transport error.
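
The idea, as a minimal userspace sketch rather than the patch itself (struct
fence_model, the model_* functions, the one-byte "header", and the -1 return
value are all assumptions made for illustration): test the flag on entry to
the receive function, so every caller is fenced no matter how it got there.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PDU_SUPPORTED 0x07 /* the only type this model accepts */

/* Hypothetical receive-side state; not the driver's queue structure. */
struct fence_model {
	bool rd_enabled; /* cleared on the first receive error */
	size_t consumed; /* bytes taken from the stream so far */
};

/* Stand-in for error recovery: mark the queue as no longer readable. */
static void model_error_recovery(struct fence_model *q)
{
	q->rd_enabled = false;
}

/* Returns the number of bytes consumed, or -1 on error or when fenced. */
static int model_recv(struct fence_model *q, const unsigned char *buf,
		      size_t len)
{
	/* The fence: after a receive error, consume nothing further. */
	if (!q->rd_enabled)
		return -1;

	if (len == 0)
		return 0;

	if (buf[0] != MODEL_PDU_SUPPORTED) {
		/* Header byte consumed; the payload stays in the stream. */
		q->consumed += 1;
		model_error_recovery(q);
		return -1;
	}

	q->consumed += len;
	return (int)len;
}
```

Once model_recv has failed on an unsupported type, any later call returns -1
without advancing consumed, whichever path invoked it.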

- Chris

Chris Leech (1):
  nvme-tcp: fence TCP socket on receive error

 drivers/nvme/host/tcp.c | 7 +++++++
 1 file changed, 7 insertions(+)

-- 
2.39.2