[PATCH] nvme: reserve a keep-alive admin tag for all transports
Chao S
coshi036 at gmail.com
Thu May 14 21:55:38 PDT 2026
On Fri, May 8, 2026, Sagi Grimberg wrote:
> Perhaps under a quirk? Would be interesting to understand which
> pci devices support this...
Thanks for the comments. I think always reserving an admin tag for
keep-alive is simpler than gating it behind a quirk.
I queried the Identify Controller KAS field on the two PCIe NVMe
drives I have locally (Micron 7450 MTFDKBA960TFR and
MTFDLAL3T8THG-1BP1DFCYY); both report KAS=0. I also could not find
any publicly documented PCIe NVMe controller that reports KAS != 0.
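
For illustration only, the KAS check can be sketched as below. This is a
Python model, not kernel or nvme-cli code. It assumes KAS is a
little-endian 16-bit field at bytes 321:320 of the Identify Controller
data and is expressed in 100 ms units; verify both against the base spec
before relying on them. The helper names are hypothetical.

```python
import struct

# Assumption: KAS occupies bytes 321:320 of the Identify Controller data.
KAS_OFFSET = 320
# Assumption: KAS is the keep-alive granularity in units of 100 ms.
KAS_UNIT_MS = 100


def decode_kas(id_ctrl: bytes) -> int:
    """Return the raw KAS value from an Identify Controller buffer."""
    if len(id_ctrl) < KAS_OFFSET + 2:
        raise ValueError("Identify Controller buffer is too short")
    (kas,) = struct.unpack_from("<H", id_ctrl, KAS_OFFSET)
    return kas


def supports_keep_alive(id_ctrl: bytes) -> bool:
    """KAS == 0 means the controller does not advertise keep-alive."""
    return decode_kas(id_ctrl) != 0


def kas_granularity_ms(id_ctrl: bytes) -> int:
    """Keep-alive timer granularity in milliseconds (0 if unsupported)."""
    return decode_kas(id_ctrl) * KAS_UNIT_MS


# Synthetic buffers standing in for real Identify Controller data.
no_kas = bytes(4096)
with_kas = bytearray(4096)
struct.pack_into("<H", with_kas, KAS_OFFSET, 10)
with_kas = bytes(with_kas)
```

A buffer like no_kas corresponds to what both local drives report. The
with_kas buffer is the legal but so far unobserved PCIe case.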
The spec (NVMe 2.0a 5.27.1.12 plus the transport binding wording
quoted upthread by Christoph) allows PCIe to support KATO; the host
should be defensive against that legal case.
The cost is a single admin tag: NVME_AQ_MQ_TAG_DEPTH stays at 30, and
the regular pool goes from 30 to 29 on non-fabrics. Fabrics has been
running with 28 regular tags without issue.
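
The tag accounting above can be modeled roughly as follows. This is an
illustrative Python sketch, not kernel code. The admin queue depth of 32
and the single AEN slot are assumptions used to arrive at the depth of
30 stated above, and regular_admin_tags() is a hypothetical helper.

```python
# Assumed values, chosen so that the tag depth matches the 30 stated above.
NVME_AQ_DEPTH = 32
NVME_NR_AEN_COMMANDS = 1  # assumed: one slot held back for async events
NVME_AQ_BLK_MQ_DEPTH = NVME_AQ_DEPTH - NVME_NR_AEN_COMMANDS
NVME_AQ_MQ_TAG_DEPTH = NVME_AQ_BLK_MQ_DEPTH - 1


def regular_admin_tags(reserved_tags: int) -> int:
    """Tags left for ordinary admin commands after reservations."""
    if not 0 <= reserved_tags <= NVME_AQ_MQ_TAG_DEPTH:
        raise ValueError("reserved_tags out of range")
    return NVME_AQ_MQ_TAG_DEPTH - reserved_tags


# Non-fabrics today: nothing reserved.
pci_before = regular_admin_tags(0)
# Non-fabrics with this patch: one tag reserved for keep-alive.
pci_after = regular_admin_tags(1)
# Fabrics: two tags reserved, giving the 28 regular tags mentioned above.
fabrics = regular_admin_tags(2)

print(pci_before, pci_after, fabrics)
```

Under these assumptions the non-fabrics pool stays one tag larger than
the fabrics pool, which already runs with fewer regular tags.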
If a real PCIe controller is later found to advertise KAS != 0 and
misbehave with keep-alive, narrowing the activation with a quirk on
top of this change is straightforward. I would prefer to land the
defensive default first.
I will send v2 shortly with this rationale in the commit message.
Chao
On Sun, May 10, 2026 at 4:53 PM Sagi Grimberg <sagi at grimberg.me> wrote:
>
>
>
> On 08/05/2026 12:31, Keith Busch wrote:
> > On Fri, May 08, 2026 at 11:04:27AM +0200, Christoph Hellwig wrote:
> >> On Tue, Apr 28, 2026 at 08:24:35AM +0100, Keith Busch wrote:
> >>>> This field specifies the timeout value for the Keep Alive feature in
> >>>> milliseconds. [...]
> >>>> The default value for this field is 0h for NVMe transports that do not require use of the Keep Alive
> >>>> feature (e.g., NVMe over PCIe). For NVMe transports that require use of the Keep Alive feature
> >>>> (e.g., RDMA and TCP), the default value for this field is 1D4C0h "
> >>>>
> >>>> To me, it sounds like for nvme-pci, keep alive isn't required, but could
> >>>> be activated.
> >>> The spec says the support is subject to the Transport binding
> >>> specification, which does not exist in the PCIe transport spec.
> >> My memories from the fabrics working group back in the day is that we
> >> explicitly intended to support it in PCIe. The wording in the spec
> >> referring to transport specs I can find is:
> >>
> >> The NVMe Transport binding specification for the associated NVMe Transport
> >> defines:
> >>
> >> o the minimum Keep Alive Timeout value, if any;
> >> o the maximum Keep Alive Timeout value, if any; and
> >> o if the Keep Alive Timer feature is required to be supported and enabled.
> >>
> >> which does not read to me like there is any required language in the
> >> transport spec to require keep alive.
> > So the absence of defining a minimum means it's simply optional? I
> > suppose I can see it that way as the intended interpretation, but seems
> > counter productive to do on PCIe when you can MMIO the controller status
> > register to verify liveness. If the controller responds successfully to
> > the feature, then I have to agree we need the host to do its part.
>
> Perhaps under a quirk? Would be interesting to understand which
> pci devices support this...