nvme tcp receive errors

Thu Apr 29 05:52:37 BST 2021

> The driver tracepoints captured millions of IO's where everything
> happened as expected, so I really think something got confused and
> mucked with the wrong request. I've added more trace points to increase
> visibility because I frankly didn't find how that could happen just from
> code inspection. We will also incorporate your patch below for the next
> recreate.

Keith, does the issue still happen if the network send is eliminated
from .queue_rq(), as in the patch below?

--

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index eb1feaacd11a..b3fafa536345 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -288,7 +288,7 @@ static inline void nvme_tcp_queue_request(struct nvme_tcp_request *req,
          * directly, otherwise queue io_work. Also, only do that if we
          * are on the same cpu, so we don't introduce contention.
          */
-       if (queue->io_cpu == __smp_processor_id() &&
+       if (0 && queue->io_cpu == __smp_processor_id() &&
             sync && empty && mutex_trylock(&queue->send_mutex)) {
                 queue->more_requests = !last;
                 nvme_tcp_send_all(queue);
--