[PATCH 0/2] nvme: sanitize KATO handling

Chao Leng lengchao at huawei.com
Wed Feb 24 02:20:57 EST 2021



On 2021/2/24 15:06, Hannes Reinecke wrote:
> On 2/24/21 7:42 AM, Chao Leng wrote:
>>
>>
>> On 2021/2/23 20:07, Hannes Reinecke wrote:
>>> Hi all,
>>>
>>> One of our customers has been running into a deadlock trying to terminate
>>> outstanding KATO commands during reset.
>>> Looking closer at it, I found that we never actually _track_ if a KATO
>>> command is submitted, so we might happily be sending several KATO commands
>>> to the same controller simultaneously.
>> Can you explain how KATO commands can be sent simultaneously?
> 
> Sure.
> Call nvme_start_keep_alive() on a dead connection.
> Just _after_ the KATO request has been sent,
> call nvme_start_keep_alive() again.
Call nvme_start_keep_alive() again? Why?
Currently only nvme_start_ctrl() calls nvme_start_keep_alive().
The ka_work is canceled synchronously before reconnection starts.
Did I miss something?
> 
> You now have an expired KATO command and the new KATO command, both active and sent to the controller.
> 
> Cheers,
> 
> Hannes


