[PATCH] nvme: reserve a keep-alive admin tag for all transports

Chao S coshi036 at gmail.com
Thu May 14 22:05:23 PDT 2026


On Tue, Apr 28, 2026, Maurizio Lombardi wrote:
> The spec 2.0a, at section 5.27.1.12 Keep Alive Timer (Feature
> Identifier 0Fh) says:
> "Keep Alive Timeout (KATO):
> ...
> The default value for this field is 0h for NVMe transports that do
> not require use of the Keep Alive feature (e.g., NVMe over PCIe).
> For NVMe transports that require use of the Keep Alive feature
> (e.g., RDMA and TCP), the default value for this field is 1D4C0h"
>
> To me, it sounds like for nvme-pci, keep alive isn't required, but
> could be activated.
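
For anyone reading along, the two defaults in that text work out as
follows: 1D4C0h is 120000 ms (120 seconds), and 0h leaves the Keep
Alive Timer disabled. Below is a minimal userspace sketch of that
arithmetic. The macro and helper names are made up for illustration
and are not kernel or spec identifiers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Default KATO values from the quoted spec text, in milliseconds.
 * The macro names are hypothetical, chosen for this sketch only. */
#define KATO_DEFAULT_REQUIRED_MS 0x1D4C0u /* e.g. RDMA and TCP */
#define KATO_DEFAULT_OPTIONAL_MS 0x0u     /* e.g. NVMe over PCIe */

/* Hypothetical helper: convert a KATO value in ms to whole seconds,
 * rounding up. The 64-bit intermediate avoids overflow near UINT32_MAX. */
static inline uint32_t kato_ms_to_seconds(uint32_t kato_ms)
{
	return (uint32_t)(((uint64_t)kato_ms + 999u) / 1000u);
}

/* Hypothetical helper: a KATO of 0 means the timer is not running. */
static inline bool kato_enabled(uint32_t kato_ms)
{
	return kato_ms != 0;
}
```

So on PCIe the timer starts out disabled, and the question in this
thread is only what happens once something sets a non-zero value.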

Thanks for the spec reference - that exact wording will be quoted in
the v2 commit message.  I am adding you to the v2 Cc list so you see
the next revision directly. Thanks again!

Chao

On Tue, Apr 28, 2026 at 3:15 AM Maurizio Lombardi <mlombard at arkamax.eu> wrote:
>
> On Tue Apr 28, 2026 at 8:47 AM CEST, Keith Busch wrote:
> > On Mon, Apr 27, 2026 at 10:29:11PM -0400, Chao Shi wrote:
> >> nvme_keep_alive_work() always allocates with BLK_MQ_REQ_RESERVED, but
> >> nvme_alloc_admin_tag_set() only sets reserved_tags for fabrics.  Since
> >> commit b58da2d270db ("nvme: update keep alive interval when kato is
> >> modified"), userspace can start keep-alive on any transport via Set
> >> Features (KATO), after which the allocation trips WARN_ON_ONCE() in
> >> blk_mq_get_tag() and fails with -EWOULDBLOCK:
> >>
> >>   nvme nvme0: keep-alive failed: -11
> >>
> >> Reserve one admin tag for keep-alive on all transports.  Fabrics keeps
> >> two, the second being for the connect command.
> >
> >> Fixes: b58da2d270db ("nvme: update keep alive interval when kato is modified")
> >>
> >> Found by FuzzNvme (Syzkaller with FEMU fuzzing framework).
> >>
> >> Acked-by: Sungwoo Kim <iam at sung-woo.kim>
> >> Acked-by: Dave Tian <daveti at purdue.edu>
> >> Acked-by: Weidong Zhu <weizhu at fiu.edu>
> >> Signed-off-by: Chao Shi <coshi036 at gmail.com>
> >> ---
> >>
> >> Reproducer (run as root on an unpatched kernel with a PCIe NVMe device):
> >
> > You have a PCI controller that doesn't return Invalid Field In Command
> > status to the KATO feature? That's weird, it's a fabrics-specific feature.
>
> Are you sure that it's fabrics-only?
>
> The spec 2.0a, at section 5.27.1.12 Keep Alive Timer (Feature
> Identifier 0Fh) says:
>
> "Keep Alive Timeout (KATO):
>
> This field specifies the timeout value for the Keep Alive feature in
> milliseconds.  [...]
> The default value for this field is 0h for NVMe transports that do
> not require use of the Keep Alive feature (e.g., NVMe over PCIe).
> For NVMe transports that require use of the Keep Alive feature
> (e.g., RDMA and TCP), the default value for this field is 1D4C0h"
>
> To me, it sounds like for nvme-pci, keep alive isn't required, but could
> be activated.
>
>
> Maurizio
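
As a side note on the failure described in the quoted commit message,
here is a toy userspace model of why a reserved allocation fails when
no reserved tags exist. It is only a sketch under simplifying
assumptions: it is not the blk-mq implementation, every toy_* name is
hypothetical, and it never waits for a tag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define TOY_MAX_TAGS 64

/* Toy tag set: tags [0, reserved_tags) form the reserved pool,
 * tags [reserved_tags, nr_tags) form the normal pool. */
struct toy_tag_set {
	unsigned int nr_tags;
	unsigned int reserved_tags;
	bool busy[TOY_MAX_TAGS];
};

static void toy_tag_set_init(struct toy_tag_set *t, unsigned int nr_tags,
			     unsigned int reserved_tags)
{
	unsigned int i;

	assert(nr_tags <= TOY_MAX_TAGS && reserved_tags <= nr_tags);
	t->nr_tags = nr_tags;
	t->reserved_tags = reserved_tags;
	for (i = 0; i < TOY_MAX_TAGS; i++)
		t->busy[i] = false;
}

/* Returns a tag index, or -EWOULDBLOCK if the chosen pool has no free
 * tag. A reserved request against an empty reserved pool can never
 * succeed, which is the case the commit message describes. */
static int toy_get_tag(struct toy_tag_set *t, bool want_reserved)
{
	unsigned int lo = want_reserved ? 0 : t->reserved_tags;
	unsigned int hi = want_reserved ? t->reserved_tags : t->nr_tags;
	unsigned int i;

	for (i = lo; i < hi; i++) {
		if (!t->busy[i]) {
			t->busy[i] = true;
			return (int)i;
		}
	}
	return -EWOULDBLOCK;
}
```

With reserved_tags set to 0 every reserved request fails right away.
With reserved_tags set to 1 the first reserved request gets tag 0 and
normal requests start at tag 1, which is the effect of reserving one
admin tag for keep-alive.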


