Request timeout seen with NVMEoF TCP
Potnuri Bharat Teja
bharat at chelsio.com
Wed Dec 16 00:51:23 EST 2020
On Monday, December 14, 2020 at 17:53:44 -0800, Sagi Grimberg wrote:
>
> > Hey Potnuri,
> >
> > Have you observed this further?
> >
> > I'd think that if io_work reschedules itself when it races
> > with the direct send path this should not happen, but there may be
> > a different race going on here. Adding Samuel, who saw
> > a similar phenomenon.
>
> I think we still have a race here with the following:
> 1. queue_rq sends h2cdata PDU (no data)
> 2. host receives r2t - prepares data PDU to send and schedules io_work
> 3. queue_rq sends another h2cdata PDU - ends up sending (2) because it was
> queued before it
> 4. io_work starts, loops but is never able to acquire the send_mutex -
> eventually it just ends (doesn't requeue; see the sketch after this list)
> 5. (3) completes, now nothing will send (2)
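>
> For reference, the pre-patch io_work loop in drivers/nvme/host/tcp.c is
> roughly the following (paraphrased and trimmed, so treat it as a sketch
> rather than the exact source). Note that if mutex_trylock() fails on
> every iteration and there is nothing to receive, pending stays false
> and the work exits without requeueing itself:
>
> 	static void nvme_tcp_io_work(struct work_struct *w)
> 	{
> 		struct nvme_tcp_queue *queue =
> 			container_of(w, struct nvme_tcp_queue, io_work);
> 		unsigned long deadline = jiffies + msecs_to_jiffies(1);
>
> 		do {
> 			bool pending = false;
> 			int result;
>
> 			if (mutex_trylock(&queue->send_mutex)) {
> 				result = nvme_tcp_try_send(queue);
> 				mutex_unlock(&queue->send_mutex);
> 				if (result > 0)
> 					pending = true;
> 				else if (unlikely(result < 0))
> 					break;
> 			}
> 			/* a trylock failure leaves pending untouched */
>
> 			result = nvme_tcp_try_recv(queue);
> 			if (result > 0)
> 				pending = true;
> 			else if (unlikely(result < 0))
> 				return;
>
> 			if (!pending)
> 				return;	/* step 4 above: no requeue */
> 		} while (!time_after(jiffies, deadline));
>
> 		queue_work_on(queue->io_cpu, nvme_tcp_wq, &queue->io_work);
> 	}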
>
> We could schedule io_work from the direct send path, but that is less
> efficient than simply draining the send queue in the direct send path;
> if not everything was sent, the write_space callback will trigger
> io_work again.
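>
> For context, the write_space callback referenced here looks roughly
> like this in drivers/nvme/host/tcp.c (paraphrased): once the socket
> has room to send again, it reschedules io_work on the queue's CPU:
>
> 	static void nvme_tcp_write_space(struct sock *sk)
> 	{
> 		struct nvme_tcp_queue *queue;
>
> 		read_lock_bh(&sk->sk_callback_lock);
> 		queue = sk->sk_user_data;
> 		if (likely(queue && sk_stream_is_writeable(sk))) {
> 			clear_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
> 			queue_work_on(queue->io_cpu, nvme_tcp_wq,
> 					&queue->io_work);
> 		}
> 		read_unlock_bh(&sk->sk_callback_lock);
> 	}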
>
> Potnuri, does this patch solve what you are seeing?
Hi Sagi,
The patch below works fine. I have had it running all night without any issues.
Thanks.
> --
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 1ba659927442..1b4e25624ba4 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -262,6 +262,16 @@ static inline void nvme_tcp_advance_req(struct nvme_tcp_request *req,
>  	}
>  }
>  
> +static inline void nvme_tcp_send_all(struct nvme_tcp_queue *queue)
> +{
> +	int ret;
> +
> +	/* drain the send queue as much as we can... */
> +	do {
> +		ret = nvme_tcp_try_send(queue);
> +	} while (ret > 0);
> +}
> +
>  static inline void nvme_tcp_queue_request(struct nvme_tcp_request *req,
>  		bool sync, bool last)
>  {
> @@ -279,7 +289,7 @@ static inline void nvme_tcp_queue_request(struct nvme_tcp_request *req,
>  	if (queue->io_cpu == smp_processor_id() &&
>  	    sync && empty && mutex_trylock(&queue->send_mutex)) {
>  		queue->more_requests = !last;
> -		nvme_tcp_try_send(queue);
> +		nvme_tcp_send_all(queue);
>  		queue->more_requests = false;
>  		mutex_unlock(&queue->send_mutex);
>  	} else if (last) {
> @@ -1122,6 +1132,14 @@ static void nvme_tcp_io_work(struct work_struct *w)
>  				pending = true;
>  			else if (unlikely(result < 0))
>  				break;
> +		} else {
> +			/*
> +			 * submission path is sending, we need to
> +			 * continue or resched because the submission
> +			 * path direct send is not concerned with
> +			 * rescheduling...
> +			 */
> +			pending = true;
>  		}
>  
>  		result = nvme_tcp_try_recv(queue);
> --
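>
> For completeness, the do/while in nvme_tcp_send_all() terminates
> because of nvme_tcp_try_send()'s return convention: heavily trimmed,
> it is roughly the following, returning 1 while it makes progress on a
> request and 0 once the send list is empty:
>
> 	static int nvme_tcp_try_send(struct nvme_tcp_queue *queue)
> 	{
> 		int ret = 1;
>
> 		if (!queue->request) {
> 			queue->request = nvme_tcp_fetch_request(queue);
> 			if (!queue->request)
> 				return 0;	/* nothing left to send */
> 		}
>
> 		/*
> 		 * ... send the PDU/data for queue->request; ret becomes
> 		 * 0 or negative when the socket blocks or errors ...
> 		 */
> 		return ret;
> 	}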