[PATCH v2 19/20] nvme-tcp: stop auth work after tearing down queues in error recovery

Chaitanya Kulkarni chaitanyak at nvidia.com
Mon Nov 14 20:18:26 PST 2022


On 11/13/22 03:24, Sagi Grimberg wrote:
> When starting error recovery there might be an authentication work
> running, and it involves I/O commands. Given the controller is tearing
> down, there is no chance for the I/O to complete other than timing out,
> which may unnecessarily take a full I/O timeout.
> 
> So first tear down the queues, fail/cancel all inflight I/O (including
> potentially authentication) and only then stop authentication. This
> ensures that failover is not stalled due to blocked authentication I/O.
> 
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
> ---
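
For reference, the ordering described in the commit message can be
sketched as a small userspace C model. This is only a toy illustration
under stated assumptions: struct ctrl, teardown_queues(), stop_auth() and
the two error_recovery variants are hypothetical stand-ins, not the
driver's actual functions or data structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified controller state (not the kernel's struct). */
struct ctrl {
	int inflight_io;      /* outstanding commands, including auth commands */
	bool auth_running;    /* authentication work is in progress */
	bool waited_timeout;  /* set if stopping auth had to wait out a timeout */
};

/* Tear down the queues and fail/cancel everything still in flight. */
static void teardown_queues(struct ctrl *c)
{
	c->inflight_io = 0;
}

/*
 * Stop the auth work. In this model the work can only finish once its
 * I/O has completed or failed, so pending I/O forces a timeout wait.
 */
static void stop_auth(struct ctrl *c)
{
	if (c->auth_running && c->inflight_io > 0) {
		c->waited_timeout = true;  /* I/O never completes on a dying ctrl */
		c->inflight_io = 0;
	}
	c->auth_running = false;
}

/* Ordering from the patch description: queues first, auth last. */
static bool error_recovery(struct ctrl *c)
{
	teardown_queues(c);
	assert(c->inflight_io == 0);  /* nothing left for auth to block on */
	stop_auth(c);
	return c->waited_timeout;
}

/* The reverse ordering, shown only for contrast. */
static bool error_recovery_auth_first(struct ctrl *c)
{
	stop_auth(c);
	teardown_queues(c);
	return c->waited_timeout;
}
```

In this model, calling error_recovery() on a controller with pending
authentication I/O returns false (no timeout wait), while
error_recovery_auth_first() on the same starting state returns true.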

It would be really cool to add an error injection knob, along with a
blktests test case for this scenario...

Looks good.

Reviewed-by: Chaitanya Kulkarni <kch at nvidia.com>

-ck


