[PATCH AUTOSEL 6.17-6.1] nvme: Use non zero KATO for persistent discovery connections
Sasha Levin
sashal at kernel.org
Thu Oct 9 08:55:07 PDT 2025
From: Alistair Francis <alistair.francis at wdc.com>
[ Upstream commit 2e482655019ab6fcfe8865b62432c6d03f0b5f80 ]
The NVMe Base Specification 2.1 states that:
"""
A host requests an explicit persistent connection ... by specifying a
non-zero Keep Alive Timer value in the Connect command.
"""
As such, if we are starting a persistent connection to a discovery
controller and the KATO is currently 0, we need to update KATO to a
non-zero value to avoid continuous timeouts on the target.
Signed-off-by: Alistair Francis <alistair.francis at wdc.com>
Reviewed-by: Hannes Reinecke <hare at suse.de>
Reviewed-by: Christoph Hellwig <hch at lst.de>
Signed-off-by: Keith Busch <kbusch at kernel.org>
Signed-off-by: Sasha Levin <sashal at kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES
- What it fixes
  - Aligns Linux host behavior with the NVMe Base Spec 2.1 requirement
    that a host must specify a non-zero Keep Alive Timer in the Connect
    command to request an explicit persistent discovery connection. The
    previous behavior left `KATO=0` for discovery controllers even when
    the connection became persistent, causing targets to time out and
    drop connections.
- Code change and behavior
  - In `nvme_start_ctrl()` (`drivers/nvme/host/core.c:4998`), on
    reconnect for discovery controllers
    (`test_bit(NVME_CTRL_STARTED_ONCE, &ctrl->flags)` and
    `nvme_discovery_ctrl(ctrl)`), the patch:
    - Checks if `ctrl->kato` is zero.
    - If zero, calls `nvme_stop_keep_alive(ctrl)`, sets `ctrl->kato =
      NVME_DEFAULT_KATO`, then calls `nvme_start_keep_alive(ctrl)`.
    - Still sends the rediscover uevent: `nvme_change_uevent(ctrl,
      "NVME_EVENT=rediscover")`.
  - This immediately starts keep-alive commands after a persistent
    discovery reconnect and ensures subsequent Connect commands
    advertise a non-zero KATO.
- Why this is correct and effective
  - Immediate effect: Even if the just-completed Connect used `kato=0`,
    forcing a non-zero `kato` here starts the host keep-alive work right
    away, avoiding target keep-alive timeouts after a persistent
    reconnect.
  - Future connections: `nvmf_connect_cmd_prep()` sets Connect’s KATO
    from `ctrl->kato` (`drivers/nvme/host/fabrics.c:426`). With this
    change, the next reconnection will send a non-zero KATO in the
    Connect command as the spec requires.
  - Safe sequence: `nvme_stop_keep_alive()` is a no-op when `kato==0`
    (`drivers/nvme/host/core.c:1412`), then `ctrl->kato` is set to
    `NVME_DEFAULT_KATO` (`drivers/nvme/host/nvme.h:31`), and
    `nvme_start_keep_alive()` only schedules work when `kato!=0`
    (`drivers/nvme/host/core.c:1404`).
- Scope and risk
  - Scope-limited: Only affects discovery controllers on reconnect
    (persistent discovery) and only when `kato==0`. No effect on:
    - Non-discovery (I/O) controllers (they already default to non-zero
      KATO).
    - Discovery controllers where userspace explicitly set a non-zero
      KATO.
  - No architectural changes; uses existing helpers and flags; no ABI
    change.
  - Regression risk is low. Earlier commits already introduced
    persistent discovery semantics and a sysfs `kato` attribute, and
    transports already honor `ctrl->kato` for Connect. This change
    simply fills a corner case where `kato` remained zero in a
    persistent discovery reconnect.
- Historical context and consistency
  - 2018: The driver explicitly avoided sending keep-alives to discovery
    controllers, per early spec constraints.
  - 2021: The code was adjusted so discovery controllers default to
    `kato=0`, while I/O controllers default to `NVME_DEFAULT_KATO`
    (commit 32feb6de). Persistent discovery connections were intended to
    have a positive KATO (via options), but implicit persistent
    reconnects could still have `kato=0`.
  - 2022: Added the rediscover uevent for persistent discovery
    reconnects (f46ef9e87) and `NVME_CTRL_STARTED_ONCE` usage.
  - This patch completes the intent by ensuring persistent discovery
    reconnects run with non-zero KATO automatically, preventing target
    timeouts and complying with spec 2.1.
- Stable backport suitability
  - Fixes a user-visible bug (target timeouts and unstable discovery
    connectivity on persistent reconnects).
  - Small, self-contained change confined to `nvme_start_ctrl()` in
    `drivers/nvme/host/core.c`.
  - No new features or interfaces; minimal risk of regression; behavior
    matches spec and existing design.
  - The dependencies are present in stable trees that already have
    persistent discovery support and the `NVME_CTRL_STARTED_ONCE`
    mechanism. For older branches that still use `test_and_set_bit` in
    the rediscover path, the logic remains valid within that conditional
    block.
- Side notes for backporters
  - Ensure the tree has `NVME_CTRL_STARTED_ONCE`,
    `nvme_discovery_ctrl()`, and the rediscover uevent path in
    `nvme_start_ctrl()`. If an older stable branch uses
    `test_and_set_bit` instead of `test_bit`, place the new KATO block
    inside that existing conditional.
  - `nvmf_connect_cmd_prep()` must already populate Connect’s `kato`
    from `ctrl->kato` (`drivers/nvme/host/fabrics.c:426`) so that future
    reconnects benefit from the updated `kato`.
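
The reconnect branch and the safe sequence described above can be
sketched as a small userspace model. This is an illustration only, not
kernel code: `struct model_ctrl`, the `model_*` helpers and
`MODEL_DEFAULT_KATO` are hypothetical names, the value 5 is assumed to
match `NVME_DEFAULT_KATO`, and the seconds-to-milliseconds conversion for
the Connect field is an assumption about `nvmf_connect_cmd_prep()` that
the notes above do not state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match the kernel's NVME_DEFAULT_KATO (seconds). */
#define MODEL_DEFAULT_KATO 5

/* Hypothetical stand-in for the few struct nvme_ctrl fields involved. */
struct model_ctrl {
	unsigned int kato;    /* keep-alive timeout in seconds; 0 disables it */
	bool started_once;    /* models the NVME_CTRL_STARTED_ONCE flag */
	bool discovery;       /* models nvme_discovery_ctrl(ctrl) */
	bool ka_scheduled;    /* models whether keep-alive work is queued */
	bool rediscover_sent; /* models the rediscover uevent */
};

/* Like nvme_stop_keep_alive(): does nothing while kato is zero. */
static void model_stop_keep_alive(struct model_ctrl *c)
{
	if (c->kato == 0)
		return;
	c->ka_scheduled = false;
}

/* Like nvme_start_keep_alive(): only schedules work when kato is non-zero. */
static void model_start_keep_alive(struct model_ctrl *c)
{
	if (c->kato == 0)
		return;
	c->ka_scheduled = true;
}

/* Models only the reconnect branch of nvme_start_ctrl() with the patch applied. */
static void model_start_ctrl(struct model_ctrl *c)
{
	if (c->started_once && c->discovery) {
		if (!c->kato) {
			model_stop_keep_alive(c);     /* no-op: kato is still 0 */
			c->kato = MODEL_DEFAULT_KATO;
			model_start_keep_alive(c);    /* now schedules keep-alive */
		}
		c->rediscover_sent = true;
	}
}

/* Assumption: the Connect command carries KATO in milliseconds. */
static uint32_t model_connect_kato_ms(const struct model_ctrl *c)
{
	return (uint32_t)c->kato * 1000u;
}
```

With `started_once` and `discovery` set and `kato == 0`, calling
`model_start_ctrl()` leaves `kato` at `MODEL_DEFAULT_KATO` with keep-alive
scheduled, so the next modeled Connect carries a non-zero value. A first
start, an I/O controller, or a discovery controller with a user-chosen
`kato` is left unchanged.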
drivers/nvme/host/core.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 6b7493934535a..5714d49932822 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4990,8 +4990,14 @@ void nvme_start_ctrl(struct nvme_ctrl *ctrl)
 	 * checking that they started once before, hence are reconnecting back.
 	 */
 	if (test_bit(NVME_CTRL_STARTED_ONCE, &ctrl->flags) &&
-	    nvme_discovery_ctrl(ctrl))
+	    nvme_discovery_ctrl(ctrl)) {
+		if (!ctrl->kato) {
+			nvme_stop_keep_alive(ctrl);
+			ctrl->kato = NVME_DEFAULT_KATO;
+			nvme_start_keep_alive(ctrl);
+		}
 		nvme_change_uevent(ctrl, "NVME_EVENT=rediscover");
+	}
 
 	if (ctrl->queue_count > 1) {
 		nvme_queue_scan(ctrl);
--
2.51.0