[PATCHv2] nvme-tcp: Do not terminate i/o commands during RESETTING

Sun Jan 14 22:22:02 PST 2024

On 1/14/24 03:53, Max Gurtovoy wrote:
> Hi Hannes,
> 
> On 12/01/2024 12:09, hare at kernel.org wrote:
>> From: Hannes Reinecke <hare at suse.de>
>>
>> Terminating commands from the timeout handler might lead
>> to a data corruption as the timeout might trigger before
>> KATO expired.
> 
> Can you please explain the data corruption and how this patch is fixing 
> it ?
> 
It's like this:
n: connection breaks
n + 1: send command 1,2,3, tmo 30s
n + 2: send keep-alive, tmo 30s
n + 31: command timeout
  - command 1 timeout:
    queue error recovery workqueue,
    return BLK_EH_RESET_TIMER
  - command 2 timeout:
    abort command
    retry command
  - start error recovery workqueue, abort command 1&3
n + 32: KATO expires, commands are retried

Now command 2 was aborted directly, and will be retried
on a different path without waiting for KATO.
So command 2 will be sent to the controller while the controller
is not yet aware that a KATO timeout has happened.
This not only violates the spec (which states that we should only
retry commands after a KATO timeout has happened), but there are
some controller implementations which need to clean up internal
state once KATO has triggered. And on those implementations we see
data corruption.

As discussed elsewhere these controllers really should implement
CQAT, but that is no excuse for us violating the spec.
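
To make the ordering above concrete, here is a minimal userspace sketch.
It is not the driver code: the enum values, struct ctrl, io_timeout(),
retry_is_safe() and replay_timeline() are all hypothetical names, and it
assumes that queueing error recovery puts the controller into RESETTING
before the err_work handler has actually run. The 'patched' flag switches
between the old behaviour and the one proposed in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the controller states named in the patch. */
enum ctrl_state { CTRL_LIVE, CTRL_RESETTING, CTRL_DELETING };

/* Hypothetical stand-ins for what the timeout handler does with a command. */
enum eh_action {
	EH_RESET_TIMER,	/* re-arm the timer, leave the command to error recovery */
	EH_ABORT_NOW,	/* abort in the handler, so the command is retried at once */
};

struct ctrl {
	enum ctrl_state state;
};

/*
 * Toy I/O timeout handler. Assumption: queueing error recovery moves the
 * controller to RESETTING before the err_work handler itself has run.
 */
static enum eh_action io_timeout(struct ctrl *c, bool patched)
{
	if (c->state == CTRL_LIVE) {
		/* First timeout: queue error recovery, keep the command. */
		c->state = CTRL_RESETTING;
		return EH_RESET_TIMER;
	}
	if (patched &&
	    (c->state == CTRL_RESETTING || c->state == CTRL_DELETING))
		return EH_RESET_TIMER;
	return EH_ABORT_NOW;
}

/* A retry is only safe once KATO has expired for the last keep-alive. */
static bool retry_is_safe(int now, int keepalive_sent, int kato)
{
	return now >= keepalive_sent + kato;
}

/* Replay the timeline with n = 0: keep-alive at 2, timeouts at 31. */
static void replay_timeline(void)
{
	struct ctrl before = { CTRL_LIVE }, after = { CTRL_LIVE };

	/* Without the patch command 2 is aborted in the handler. */
	assert(io_timeout(&before, false) == EH_RESET_TIMER);	/* command 1 */
	assert(io_timeout(&before, false) == EH_ABORT_NOW);	/* command 2 */

	/* With the patch both commands are left to error recovery. */
	assert(io_timeout(&after, true) == EH_RESET_TIMER);	/* command 1 */
	assert(io_timeout(&after, true) == EH_RESET_TIMER);	/* command 2 */

	assert(!retry_is_safe(31, 2, 30));	/* n + 31: KATO still running */
	assert(retry_is_safe(32, 2, 30));	/* n + 32: KATO expired */
}
```

Calling replay_timeline() from a main() walks through both variants; the
point is only that without the patch command 2 leaves the handler at
n + 31, one second before a retry would be safe.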

>> When several commands have been sent in a batch and
>> the command timeouts trigger just after the keep-alive
>> command has been sent then the first command will trigger
>> the error recovery. But all other commands will timeout
>> directly afterwards and will hit the timeout handler
>> before the err_work workqueue handler has started.
>> This results in these commands being aborted and
>> immediately retried without waiting for KATO.
>> So return BLK_EH_RESET_TIMER for I/O commands when
>> the controller is in 'RESETTING' or 'DELETING'
>> state to ensure that the commands will
>> be retried only after the KATO interval.
> 
> I'm not sure I understand how does KATO and reconnect_delay are related ?
> 
Not sure I'm following; reconnection delay doesn't come into it...

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare at suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Ivo Totev, Andrew McDonald,
Werner Knoblich