[PATCH AUTOSEL 6.18-6.12] nvme-fabrics: add ENOKEY to no retry criteria for authentication failures
Sasha Levin
sashal at kernel.org
Thu Dec 11 18:08:58 PST 2025
From: Justin Tee <justintee8345 at gmail.com>
[ Upstream commit 13989207ee29c40501e719512e8dc90768325895 ]
With authentication, in addition to EKEYREJECTED there is also no point in
retrying reconnects when status is ENOKEY. Thus, add -ENOKEY as another
criteria to determine when to stop retries.
Cc: Daniel Wagner <wagi at kernel.org>
Cc: Hannes Reinecke <hare at suse.de>
Closes: https://lore.kernel.org/linux-nvme/20250829-nvme-fc-sync-v3-0-d69c87e63aee@kernel.org/
Signed-off-by: Justin Tee <justintee8345 at gmail.com>
Tested-by: Daniel Wagner <wagi at kernel.org>
Reviewed-by: Daniel Wagner <wagi at kernel.org>
Reviewed-by: Hannes Reinecke <hare at suse.de>
Signed-off-by: Keith Busch <kbusch at kernel.org>
Signed-off-by: Sasha Levin <sashal at kernel.org>
---
LLM Generated explanations, may be completely bogus:
## Comprehensive Analysis
### 1. COMMIT MESSAGE ANALYSIS
- **Purpose**: Add `-ENOKEY` to prevent pointless reconnect retries when
the authentication key is missing
- **Tags**: `Tested-by` (Daniel Wagner), `Reviewed-by` (Daniel Wagner,
Hannes Reinecke), `Closes:` (lore link)
- **Missing**: No explicit `Cc: stable at vger.kernel.org` or `Fixes:` tag
- **Maintainer signoff**: Keith Busch (NVMe maintainer)
### 2. CODE CHANGE ANALYSIS
The change is minimal, a single-line modification:
```c
- if (status == -EKEYREJECTED)
+ if (status == -EKEYREJECTED || status == -ENOKEY)
```
**Where `-ENOKEY` is returned:**
- `drivers/nvme/host/auth.c:720` - No session key negotiated
- `drivers/nvme/host/auth.c:973` - No host key (`ctrl->host_key` is
NULL)
- `drivers/nvme/host/auth.c:978` - Controller key configured but invalid
- `drivers/nvme/host/tcp.c:1698,2080,2112,2121` - Various TLS/PSK key
failures
All these represent "key does not exist" scenarios where retrying cannot
help.
**Function impact:** `nvmf_should_reconnect()` is called by all three
NVMe fabric transports (TCP, FC, RDMA), from
`nvme_tcp_reconnect_or_remove()`, `nvme_fc_reconnect_or_delete()`, and
`nvme_rdma_reconnect_or_remove()` respectively, so the change applies to
every transport.
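The decision described above can be sketched as a small userspace model.
This is not the kernel source: `struct model_ctrl`,
`model_should_reconnect()` and `MODEL_STATUS_DNR` are hypothetical names
introduced for illustration, and the final retry-budget check is an
assumption about the part of the function that the diff context does not
show in full.

```c
#include <errno.h>
#include <stdbool.h>

/* Fallbacks for non-Linux libcs; values follow Linux asm-generic/errno.h. */
#ifndef ENOKEY
#define ENOKEY 126
#endif
#ifndef EKEYREJECTED
#define EKEYREJECTED 129
#endif

/* Stand-in for the "Do Not Retry" bit of an NVMe status word (assumed 0x4000). */
#define MODEL_STATUS_DNR 0x4000

/* Hypothetical stand-in for the controller fields the check reads. */
struct model_ctrl {
	int max_reconnects;	/* -1 means "retry forever" */
	int nr_reconnects;	/* reconnect attempts made so far */
};

static bool model_should_reconnect(const struct model_ctrl *ctrl, int status)
{
	/* Positive values are NVMe status codes; DNR forbids a retry. */
	if (status > 0 && (status & MODEL_STATUS_DNR))
		return false;

	/* Authentication failures that a retry cannot fix. */
	if (status == -EKEYREJECTED || status == -ENOKEY)
		return false;

	/* Assumed remainder: retry while the reconnect budget lasts. */
	return ctrl->max_reconnects == -1 ||
	       ctrl->nr_reconnects < ctrl->max_reconnects;
}
```

In this model a transient error such as `-ECONNREFUSED` still consumes the
retry budget, while `-ENOKEY` now stops the loop on the first failure, the
same way `-EKEYREJECTED` already did.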
### 3. CLASSIFICATION
- **Bug fix**: Yes - fixes futile retry behavior
- **New feature**: No - extends existing error handling pattern
- **Follows established pattern**: The `-EKEYREJECTED` check was added
in v6.10 (commit 0e34bd9605f6c) with identical logic
### 4. SCOPE AND RISK ASSESSMENT
- **Lines changed**: 1
- **Files touched**: 1
- **Complexity**: Trivial
- **Risk**: Extremely low - change only affects reconnect decision for
an already-failed authentication
- **Regression potential**: Near zero - the code path only executes when
authentication already failed
### 5. USER IMPACT
- **Who is affected**: Users of NVMe Fabrics (TCP/RDMA/FC) with
authentication enabled
- **Severity without fix**: Wasteful reconnect retries, potential log
spam, resource consumption
- **Not a crash/data corruption**: This is a behavioral improvement, not
a critical fix
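To put a rough number on the wasted retries, the sketch below counts how
many reconnect attempts a controller would schedule when every connect
fails with the same status. All names here (`div_round_up()`,
`is_fatal()`, `futile_reconnects()`) are hypothetical, and the model
assumes the retry budget is the controller-loss timeout divided by the
reconnect delay, rounded up. The 600 s and 10 s figures in the usage note
are illustrative values, not taken from the patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Fallbacks for non-Linux libcs; values follow Linux asm-generic/errno.h. */
#ifndef ENOKEY
#define ENOKEY 126
#endif
#ifndef EKEYREJECTED
#define EKEYREJECTED 129
#endif

/* Ceiling division for positive integers. */
static int div_round_up(int n, int d)
{
	return (n + d - 1) / d;
}

/* True if the status stops retries, with or without the -ENOKEY check. */
static bool is_fatal(int status, bool with_fix)
{
	if (status == -EKEYREJECTED)
		return true;
	return with_fix && status == -ENOKEY;
}

/* Count reconnect attempts made when every connect fails with `status`. */
static int futile_reconnects(int status, int ctrl_loss_tmo,
			     int reconnect_delay, bool with_fix)
{
	/* Assumed budget: timeout / delay, rounded up. */
	int max_reconnects = div_round_up(ctrl_loss_tmo, reconnect_delay);
	int nr_reconnects = 0;

	while (!is_fatal(status, with_fix) && nr_reconnects < max_reconnects)
		nr_reconnects++;

	return nr_reconnects;
}
```

With a 600 s loss timeout and a 10 s delay, `futile_reconnects(-ENOKEY,
600, 10, false)` yields 60 attempts in this model, and the same call with
`with_fix` set to `true` yields 0.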
### 6. STABILITY INDICATORS
- Tested by Daniel Wagner (NVMe developer)
- Reviewed by Daniel Wagner and Hannes Reinecke (both storage/NVMe
experts)
- Clean, simple change with clear semantics
### 7. DEPENDENCY CHECK
- Requires commit `0e34bd9605f6c` ("nvme: do not retry authentication
failures") from v6.10
- The NVMe authentication feature itself was added in v6.0 (`f50fff73d620c`)
- Backport applies cleanly to trees with the `-EKEYREJECTED` check
### Decision Rationale
**Pros for backporting:**
- Trivial one-line change with near-zero regression risk
- Fixes real wasteful behavior (pointless retries that can never
succeed)
- Follows existing code pattern already established
- Reviewed and tested by domain experts
- Semantically correct: `-ENOKEY` means "no key available" - retry won't
create one
**Cons for backporting:**
- No explicit `Cc: stable at vger.kernel.org` tag from maintainers
- Not a crash, security bug, or data corruption fix
- NVMe authentication is a relatively niche feature
- Bug impact is resource waste, not functional failure
**Conclusion:**
This is a low-risk, obviously correct bug fix that prevents wasteful
behavior. While it lacks explicit stable tags and isn't fixing a
critical bug, the change is so simple and safe that the benefit-to-risk
ratio strongly favors inclusion. The fix completes the authentication
error handling that was started with the `-EKEYREJECTED` check, making
it a natural complement to that existing code. Stable kernel users with
NVMe authentication would benefit from not having pointless reconnection
storms when their keys are missing.
**YES**
drivers/nvme/host/fabrics.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index 2e58a7ce10905..55a8afd2efd50 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -592,7 +592,7 @@ bool nvmf_should_reconnect(struct nvme_ctrl *ctrl, int status)
 	if (status > 0 && (status & NVME_STATUS_DNR))
 		return false;
 
-	if (status == -EKEYREJECTED)
+	if (status == -EKEYREJECTED || status == -ENOKEY)
 		return false;
 
 	if (ctrl->opts->max_reconnects == -1 ||
--
2.51.0