[PATCH] nvmet: fix pre-auth out-of-bounds heap read in Discovery Get Log Page

Chaitanya Kulkarni chaitanyak at nvidia.com
Wed May 27 19:05:19 PDT 2026


+linux-nvme

On 5/27/26 18:49, hexlabsecurity at proton.me wrote:
> Hi Christoph Hellwig,
>
> Thanks -- the goto-cleanup ladder is cleaner and avoids the
> else-if/else nesting. v2 attached follows that flow.
>
> One small implementation note: `data_len` is size_t and
> `alloc_len - offset` is evaluated as unsigned long long (offset is
> u64), so bare `min()` trips the macro's __typecheck even though both
> operands are 64-bit unsigned on 64-bit builds.
> v2 uses `min_t(size_t, data_len, alloc_len - offset)` instead, matching
> the analogous shape in io-cmd-bdev.c:228 (`len = min_t(size_t,
> miter->length, resid);`). Happy to change it to plain `min()` if you'd
> rather restructure the types to make that work.
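
For readers following the type argument above, here is a minimal userspace
sketch (not kernel code) of the capped-length computation. `MIN_T` is a
hypothetical stand-in for the kernel's `min_t()`, `capped_len` and
`difference_is_ull` are illustrative names, and `unsigned long long` stands
in for `u64`. It only shows that `size_t - u64` has type unsigned long long
under the usual arithmetic conversions and that casting both operands to
size_t gives one common type; it does not reproduce the kernel macro's
type check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's min_t(): cast both
 * operands to one type, then compare. Arguments are evaluated twice here,
 * so pass side-effect-free expressions only. */
#define MIN_T(type, a, b) ((type)(a) < (type)(b) ? (type)(a) : (type)(b))

/* 1 if the expression has type unsigned long long, else 0 (C11 _Generic). */
#define IS_ULL(expr) _Generic((expr), unsigned long long: 1, default: 0)

/* Illustrative helper: bytes available to copy. The caller guarantees
 * offset <= alloc_len, mirroring the check the patch adds. */
static size_t capped_len(size_t data_len, size_t alloc_len,
			 unsigned long long offset)
{
	assert(offset <= alloc_len);
	return MIN_T(size_t, data_len, alloc_len - offset);
}

/* Returns 1 when (size_t - unsigned long long) has type unsigned long long. */
static int difference_is_ull(void)
{
	size_t alloc_len = 1024;
	unsigned long long offset = 0;

	return IS_ULL(alloc_len - offset);
}
```

With a 1024-byte log page, `capped_len(4096, 1024, 1024)` yields 0 and
`capped_len(4096, 1024, 0)` yields 1024.
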
>
> Applies cleanly to mainline at 27fa82620cba and all current LTS HEADs
> (v6.18.33, v6.12.91, v6.6.141, v6.1.174) without modification, so a
> single Cc: stable backport still covers all five trees.
>
> I also retested on a KASAN INLINE (`kasan.fault=panic`) lab kernel:
> pre-fix, `nvme get-log lpo=alloc_len len=4096` panics and reboots the
> host when KASAN catches the OOB memcpy in nvmet_copy_to_sgl; with this
> patch the same probe returns cleanly with zero kernel-pointer qwords
> leaked, and `lpo > alloc_len` correctly returns NVME_SC_INVALID_FIELD.
>
> Suggested-by trailer added.
>
> Thanks,
> Bryam Vargas
> Independent security researcher.
> HEXLAB SAS (registration pending) -- Cali, Colombia.
>
>  From 6710e68439c458d691a4fe5c7fa354404745dd0a Mon Sep 17 00:00:00 2001
> From: Bryam Vargas <hexlabsecurity at proton.me>
> Date: Wed, 27 May 2026 15:00:00 -0500
> Subject: [PATCH v2] nvmet: fix pre-auth out-of-bounds heap read in Discovery
>   Get Log Page
>
> nvmet_execute_disc_get_log_page() validates only the dword alignment
> of the host-supplied Log Page Offset (lpo).  The 64-bit offset is then
> added to a small kzalloc'd buffer that holds the discovery log page
> and the result is passed straight to nvmet_copy_to_sgl(), which
> memcpy()s data_len bytes out to the host with no source-side bound
> check:
>
>      u64 offset      = nvmet_get_log_page_offset(req->cmd);  /* 64-bit, host-controlled */
>      size_t data_len = nvmet_get_log_page_len(req->cmd);     /* 32-bit, host-controlled */
>      ...
>      if (offset & 0x3) { ... }                               /* only check */
>      ...
>      alloc_len = sizeof(*hdr) + entry_size * discovery_log_entries(req);
>      buffer = kzalloc(alloc_len, GFP_KERNEL);
>      ...
>      status = nvmet_copy_to_sgl(req, 0, buffer + offset, data_len);
>
> The Discovery controller is unauthenticated -- nvmet_host_allowed()
> returns true unconditionally for the discovery subsystem -- so the call
> is reachable pre-authentication by any TCP/RDMA/FC peer that can reach
> the nvmet target.  With a discovery log page of ~1 KiB, an attacker
> requesting up to 4 KiB starting at offset == alloc_len reads the slab
> memory that follows the buffer and receives its contents over the
> fabric (an empirical run on a default nvmet-tcp loopback target leaked
> 81 canonical kernel pointers in one Get Log Page response).  Pointing
> the offset at unmapped kernel memory instead faults the in-kernel
> memcpy and crashes the target host (or panics it, with
> panic_on_oops=1).
>
> The attacker-controlled source-side offset pattern
> "nvmet_copy_to_sgl(req, 0, buffer + ATTACKER_OFFSET, ...)" is unique
> to nvmet_execute_disc_get_log_page in the entire nvmet codebase: every
> other Get Log Page handler in admin-cmd.c either ignores lpo (and
> silently starts every response at offset 0) or tracks a local
> destination offset with a fixed source pointer.
>
> Validate the host-supplied offset against the log page size, cap the
> copy length to what is actually available, and zero-fill any remainder
> of the host transfer buffer.  The zero-fill matches the existing
> short-response pattern in nvmet_execute_get_log_changed_ns()
> (admin-cmd.c) and prevents leaking transport SGL contents when the
> host asks for more bytes than the log page contains.
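
The three steps in the paragraph above (validate, cap, zero-fill) can be
modelled in userspace roughly as follows. This is a sketch under stated
assumptions, not the patch itself: `model_copy_log_page`, `MODEL_OK` and
`MODEL_INVALID_FIELD` are hypothetical names, a flat `host_buf` stands in
for the transport SGL, and plain `memcpy`/`memset` stand in for
`nvmet_copy_to_sgl()` and `nvmet_zero_sgl()`. The existing dword-alignment
check on lpo is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative return codes; the patch uses NVMe status values instead. */
enum { MODEL_OK = 0, MODEL_INVALID_FIELD = -1 };

/*
 * Hypothetical userspace model of the fixed copy-out step.
 * log/alloc_len:   the log page buffer and its size in bytes.
 * offset/data_len: host-supplied offset and requested transfer length.
 * host_buf:        destination with room for data_len bytes.
 */
static int model_copy_log_page(uint8_t *host_buf, size_t data_len,
			       const uint8_t *log, size_t alloc_len,
			       uint64_t offset)
{
	size_t copy_len;

	assert(host_buf != NULL && log != NULL);

	/* Reject offsets past the end of the log page. */
	if (offset > alloc_len)
		return MODEL_INVALID_FIELD;

	/* Cap the copy to the bytes that actually exist after offset. */
	copy_len = alloc_len - (size_t)offset;
	if (copy_len > data_len)
		copy_len = data_len;

	memcpy(host_buf, log + offset, copy_len);

	/* Zero-fill the remainder so no stale destination bytes go out. */
	memset(host_buf + copy_len, 0, data_len - copy_len);
	return MODEL_OK;
}
```

In this model, offset == alloc_len is accepted and yields an all-zero
response, while any larger offset is rejected before memory is touched.
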
>
> Reported-by: Bryam Vargas <hexlabsecurity at proton.me>
> Suggested-by: Christoph Hellwig <hch at lst.de>
> Fixes: a07b4970f464 ("nvmet: add a generic NVMe target")
> Cc: stable at vger.kernel.org
> Signed-off-by: Bryam Vargas <hexlabsecurity at proton.me>
> ---
> v2: rewrote the validation flow per Christoph's suggestion -- single
>      `out_free_buffer` cleanup label reached by `goto` on the offset
>      overflow path, `min_t(size_t, ...)` for the capped copy length,
>      one fewer level of nesting.  `min_t(size_t, ...)` (rather than
>      bare `min()`) because `data_len` is `size_t` and `alloc_len -
>      offset` promotes to `unsigned long long` (since `offset` is
>      `u64`), which trips the kernel min() __typecheck; the size_t
>      rendition matches the analogous shape in io-cmd-bdev.c:228.
>
>      Empirically verified on a Linux 6.12.90 KASAN INLINE +
>      kasan.fault=panic VM: pre-fix, `nvme get-log lpo=alloc_len
>      len=4096` panics and reboots the host when KASAN catches the OOB
>      memcpy; post-fix, the same probe returns cleanly with zero
>      kernel-pointer qwords leaked, and `lpo > alloc_len` returns
>      NVME_SC_INVALID_FIELD as intended.
>
> v1: https://lore.kernel.org/linux-nvme/ -- (search by Reported-by)
>
>   drivers/nvme/target/discovery.c | 23 ++++++++++++++++++++++-
>   1 file changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/target/discovery.c b/drivers/nvme/target/discovery.c
> index e9b35549e254..114869d16a1f 100644
> --- a/drivers/nvme/target/discovery.c
> +++ b/drivers/nvme/target/discovery.c
> @@ -166,6 +166,7 @@ static void nvmet_execute_disc_get_log_page(struct nvmet_req *req)
>   	u64 offset = nvmet_get_log_page_offset(req->cmd);
>   	size_t data_len = nvmet_get_log_page_len(req->cmd);
>   	size_t alloc_len;
> +	size_t copy_len;
>   	struct nvmet_subsys_link *p;
>   	struct nvmet_port *r;
>   	u32 numrec = 0;
> @@ -242,7 +243,27 @@ static void nvmet_execute_disc_get_log_page(struct nvmet_req *req)
>   
>   	up_read(&nvmet_config_sem);
>   
> -	status = nvmet_copy_to_sgl(req, 0, buffer + offset, data_len);
> +	/*
> +	 * Validate the host-supplied log page offset before copying out.
> +	 * Without this check, the host controls a 64-bit byte offset into
> +	 * a small kzalloc'd buffer: a value past the log page lets the
> +	 * subsequent memcpy read adjacent kernel heap, and a value aimed
> +	 * at unmapped kernel memory faults the in-kernel copy and crashes
> +	 * the target host. The Discovery controller is unauthenticated,
> +	 * so any fabric peer that can reach the target can trigger the bug.
> +	 */
> +	if (offset > alloc_len) {
> +		req->error_loc =
> +			offsetof(struct nvme_get_log_page_command, lpo);
> +		status = NVME_SC_INVALID_FIELD | NVME_STATUS_DNR;
> +		goto out_free_buffer;
> +	}
> +
> +	copy_len = min_t(size_t, data_len, alloc_len - offset);
> +	status = nvmet_copy_to_sgl(req, 0, buffer + offset, copy_len);
> +	if (!status && copy_len < data_len)
> +		status = nvmet_zero_sgl(req, copy_len, data_len - copy_len);
> +out_free_buffer:
>   	kfree(buffer);
>   out:
>   	nvmet_req_complete(req, status);

Looks good.

Reviewed-by: Chaitanya Kulkarni <kch at nvidia.com>

-ck



