Request timeout seen with NVMEoF TCP

Wunderlich, Mark mark.wunderlich at intel.com
Tue Dec 15 13:30:49 EST 2020


> I think we still have a race here with the following:
> 1. queue_rq sends h2cdata PDU (no data)
> 2. host receives r2t - prepares data PDU to send and schedules io_work
> 3. queue_rq sends another h2cdata PDU - ends up sending (2) because it was queued before it
> 4. io_work starts, loops but is never able to acquire the send_mutex - eventually just ends (doesn't requeue)
> 5. (3) completes, now nothing will send (2)

> We can schedule io_work from the direct send path, but that is less efficient than just trying to drain the send queue in the direct send path and, if not all of it was sent, letting the write_space callback trigger it.
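
For reference, the direct send path we are talking about looks roughly like this (paraphrased from my reading of drivers/nvme/host/tcp.c; the exact code in the tree under test may differ slightly):

static inline void nvme_tcp_queue_request(struct nvme_tcp_request *req,
		bool sync, bool last)
{
	struct nvme_tcp_queue *queue = req->queue;
	bool empty;

	empty = llist_add(&req->lentry, &queue->req_list) &&
		list_empty(&queue->send_list) && !queue->request;

	/* send inline if we are first in line and can take send_mutex ... */
	if (queue->io_cpu == smp_processor_id() &&
	    sync && empty && mutex_trylock(&queue->send_mutex)) {
		nvme_tcp_try_send(queue);
		mutex_unlock(&queue->send_mutex);
	} else if (last) {
		/* ... otherwise only 'last' requests kick io_work */
		queue_work_on(queue->io_cpu, nvme_tcp_wq, &queue->io_work);
	}
}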

Wouldn't the addition of the change to io_work() itself result in step (4) above never occurring?  Pending would always be set if the mutex cannot be acquired, and if io_work() then exceeds its time slice it always re-queues itself.  So io_work() should always end up draining any send list eventually - unless io_work() exits early for some reason via the try_recv (ret < 0) return point without requeuing.
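
Roughly, with the io_work() part of your patch applied, I picture the loop looking something like this (a sketch against my reading of tcp.c, not your exact diff):

static void nvme_tcp_io_work(struct work_struct *w)
{
	struct nvme_tcp_queue *queue =
		container_of(w, struct nvme_tcp_queue, io_work);
	unsigned long deadline = jiffies + msecs_to_jiffies(1);

	do {
		bool pending = false;
		int result;

		if (mutex_trylock(&queue->send_mutex)) {
			result = nvme_tcp_try_send(queue);
			mutex_unlock(&queue->send_mutex);
			if (result > 0)
				pending = true;
		} else {
			/* proposed: could not get send_mutex (direct send
			 * path holds it), so make sure we stay scheduled
			 * instead of silently ending */
			pending = true;
		}

		result = nvme_tcp_try_recv(queue);
		if (result > 0)
			pending = true;
		else if (unlikely(result < 0))
			return;	/* the ret < 0 early exit without requeue */

		if (!pending)
			return;
	} while (!time_after(jiffies, deadline));

	/* time slice used up with work still pending - requeue ourselves */
	queue_work_on(queue->io_cpu, nvme_tcp_wq, &queue->io_work);
}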

Or, can there be a case where there are sends through queue_rq where 'last' is false (and the inline send conditions fail), but there is no subsequent send with 'last' true to schedule io_work()?  Might try changing queue_request() to always queue io_work() if the inline send fails (not looking at 'last'), and add the second part of your patch for io_work() to set pending to true.  Also trap if the ret < 0 case above ever happens (or do a break instead of a return).  Re-run and see if the failure still occurs.  Just a thought.
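
Something along these lines in nvme_tcp_queue_request() is what I have in mind - untested, just to illustrate the idea of not looking at 'last' when the inline send is not taken:

	if (queue->io_cpu == smp_processor_id() &&
	    sync && empty && mutex_trylock(&queue->send_mutex)) {
		nvme_tcp_try_send(queue);
		mutex_unlock(&queue->send_mutex);
	} else {
		/* always kick io_work when we could not send inline,
		 * regardless of 'last' */
		queue_work_on(queue->io_cpu, nvme_tcp_wq, &queue->io_work);
	}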


