[PATCH 3/3] nvme: start keep-alive after admin queue setup
Hannes Reinecke
hare at suse.de
Mon Nov 20 07:05:21 PST 2023
On 11/20/23 15:25, Sagi Grimberg wrote:
>
>
> On 11/20/23 16:19, Hannes Reinecke wrote:
>> On 11/20/23 14:39, Sagi Grimberg wrote:
>>>
>>>> Setting up I/O queues might take quite some time on larger and/or
>>>> busy setups, so KATO might expire before all I/O queues could be
>>>> set up.
>>>> Fix this by starting the keep-alive from the ->init_ctrl_finish()
>>>> callback, and stopping it when calling nvme_cancel_admin_tagset().
>>>
>>> If this is a fix, the title should describe the issue it is fixing, and
>>> the body should say how it is fixing it.
>>>
>>>> Signed-off-by: Hannes Reinecke <hare at suse.de>
>>>> ---
>>>> drivers/nvme/host/core.c | 6 +++---
>>>> drivers/nvme/host/fc.c | 6 ++++++
>>>> 2 files changed, 9 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>>> index 62612f87aafa..f48b4f735d2d 100644
>>>> --- a/drivers/nvme/host/core.c
>>>> +++ b/drivers/nvme/host/core.c
>>>> @@ -483,6 +483,7 @@ EXPORT_SYMBOL_GPL(nvme_cancel_tagset);
>>>> void nvme_cancel_admin_tagset(struct nvme_ctrl *ctrl)
>>>> {
>>>> + nvme_stop_keep_alive(ctrl);
>>>> if (ctrl->admin_tagset) {
>>>> blk_mq_tagset_busy_iter(ctrl->admin_tagset,
>>>> nvme_cancel_request, ctrl);
>>>
>>> There is a cross dependency here, now nvme_cancel_admin_tagset needs to
>>> have the keep-alive stopped first, which may be waiting on I/O, which
>>> needs to be cancelled...
>>>
>>> Keep in mind that kato can be arbitrarily long, so this function
>>> may now block for a full kato period.
>>>
>>> I also think the function now does more than simply cancelling the
>>> inflight admin tagset, as its name suggests.
>>>
>> Hmm. I could move it out of cancel_admin_tagset(). It means that I'll
>> have to touch each transport driver, but as I have to touch at least
>> fc anyway I guess it's okay.
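To be explicit, what I have in mind is that each transport stops the
keep-alive itself before tearing down the admin queue; roughly like
this (untested sketch only, 'nvme_xxx_teardown_admin_queue' is just a
placeholder for the per-transport admin queue teardown path, the real
patch would touch fc/tcp/rdma individually):

static void nvme_xxx_teardown_admin_queue(struct nvme_ctrl *ctrl)
{
        /*
         * Stop the keep-alive in the transport, before the admin
         * tagset is cancelled, instead of doing it implicitly inside
         * nvme_cancel_admin_tagset() itself.
         */
        nvme_stop_keep_alive(ctrl);
        nvme_cancel_admin_tagset(ctrl);
}

That keeps nvme_cancel_admin_tagset() doing only what its name says.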
>>
>>>> @@ -3200,6 +3201,8 @@ int nvme_init_ctrl_finish(struct nvme_ctrl
>>>> *ctrl, bool was_suspended)
>>>> clear_bit(NVME_CTRL_DIRTY_CAPABILITY, &ctrl->flags);
>>>> ctrl->identified = true;
>>>> + nvme_start_keep_alive(ctrl);
>>>> +
>>>
>>> I'm fine with moving it here. But instead, maybe just change
>>> nvme_start_keep_alive() to use a zero delay and keep it where it
>>> is? will that help?
>>>
>> Not really. We will still fail if setting up the I/O queues takes
>> longer than the KATO period.
>
> Why? Is this specific for non-tbkas? or very short kato?
Non-tbkas.
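And a zero initial delay doesn't change the picture as long as the
keep-alive is only started from nvme_start_ctrl(), i.e. after all I/O
queues have been connected. Roughly (sketch only):

        /*
         * A zero delay only moves the _first_ keep-alive forward
         * relative to the nvme_start_keep_alive() call; it doesn't
         * help when that call itself only happens after I/O queue
         * setup, as is the case today via nvme_start_ctrl().
         */
        queue_delayed_work(nvme_wq, &ctrl->ka_work, 0);

So the window between admin queue setup and nvme_start_ctrl() still
sees no keep-alive traffic, and with non-TBKAS the controller-side
keep-alive timer can expire in the meantime.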
> because connect is effectively a command that should be counted as
> indication that the controller is fine.
>
Yeah, indeed.
> If you are not able to connect even a single queue, I'd say something
> is wrong and it could be argued that you _want_ to fail.
This is in conjunction with authentication, where some controller
implementations use external entities to calculate and/or validate the
DH parameters, which takes time.
The scenario is that the admin queue connects fine, but the I/O queues
take some time to set up, until eventually KATO fires before all queues
are established.
As you said, TBKAS should avoid this scenario, but non-TBKAS is still
a valid setup.
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare at suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Ivo Totev, Andrew McDonald,
Werner Knoblich