[PATCH] nvmet: fix pre-auth out-of-bounds heap read in Discovery Get Log Page
Chaitanya Kulkarni
chaitanyak at nvidia.com
Wed May 27 19:05:19 PDT 2026
+linux-nvme
On 5/27/26 18:49, hexlabsecurity at proton.me wrote:
> Hi Christoph Hellwig,
>
> Thanks -- the goto-cleanup ladder is cleaner and avoids the
> else-if/else nesting. v2 attached follows that flow.
>
> One small note on types: `data_len` is size_t and `alloc_len - offset`
> promotes to unsigned long long (offset is u64), so bare `min()` trips
> the macro's __typecheck even though both operands are 64-bit unsigned.
> v2 uses `min_t(size_t, data_len, alloc_len - offset)` instead, matching
> the analogous shape in io-cmd-bdev.c:228 (`len = min_t(size_t,
> miter->length, resid);`). Happy to change it to plain `min()` if you'd
> rather restructure the types to make that work.
>
> Applies cleanly to mainline at 27fa82620cba and all current LTS HEADs
> (v6.18.33, v6.12.91, v6.6.141, v6.1.174) without modification, so a
> single Cc: stable backport still covers all five trees.
>
> I also retested on a KASAN INLINE (`kasan.fault=panic`) lab kernel:
> pre-fix `nvme get-log lpo=alloc_len len=4096` reboots the host via
> KASAN catching the OOB memcpy in nvmet_copy_to_sgl; with this patch
> the same probe returns cleanly with zero kernel-pointer qwords leaked,
> and `lpo > alloc_len` correctly returns NVME_SC_INVALID_FIELD.
>
> Suggested-by trailer added.
>
> Thanks,
> Bryam Vargas
> Independent security researcher.
> HEXLAB SAS (registration pending) -- Cali, Colombia.
>
> From 6710e68439c458d691a4fe5c7fa354404745dd0a Mon Sep 17 00:00:00 2001
> From: Bryam Vargas <hexlabsecurity at proton.me>
> Date: Wed, 27 May 2026 15:00:00 -0500
> Subject: [PATCH v2] nvmet: fix pre-auth out-of-bounds heap read in Discovery
> Get Log Page
>
> nvmet_execute_disc_get_log_page() validates only the dword alignment
> of the host-supplied Log Page Offset (lpo). The 64-bit offset is then
> added to a small kzalloc'd buffer that holds the discovery log page
> and the result is passed straight to nvmet_copy_to_sgl(), which
> memcpy()s data_len bytes out to the host with no source-side bound
> check:
>
>     u64 offset = nvmet_get_log_page_offset(req->cmd);   /* 64-bit host */
>     size_t data_len = nvmet_get_log_page_len(req->cmd); /* 32-bit host */
>     ...
>     if (offset & 0x3) { ... }                           /* only check */
>     ...
>     alloc_len = sizeof(*hdr) + entry_size * discovery_log_entries(req);
>     buffer = kzalloc(alloc_len, GFP_KERNEL);
>     ...
>     status = nvmet_copy_to_sgl(req, 0, buffer + offset, data_len);
>
> The Discovery controller is unauthenticated -- nvmet_host_allowed()
> returns true unconditionally for the discovery subsystem -- so the call
> is reachable pre-authentication by any TCP/RDMA/FC peer that can reach
> the nvmet target. With a discovery log page of ~1 KiB, an attacker
> requesting up to 4 KiB starting at offset == alloc_len reads the slab
> memory that follows the buffer and has its contents returned over the
> fabric (an empirical run on a default nvmet-tcp loopback target leaked
> 81 canonical kernel pointers in one Get Log Page response). Pointing
> the offset at unmapped kernel memory faults the in-kernel memcpy and
> crashes (or panics, with panic_on_oops=1) the target host instead.
>
> The attacker-controlled source-side offset pattern
> "nvmet_copy_to_sgl(req, 0, buffer + ATTACKER_OFFSET, ...)" is unique
> to nvmet_execute_disc_get_log_page in the entire nvmet codebase: every
> other Get Log Page handler in admin-cmd.c either ignores lpo (and
> silently starts every response at offset 0) or tracks a local
> destination offset with a fixed source pointer.
>
> Validate the host-supplied offset against the log page size, cap the
> copy length to what is actually available, and zero-fill any remainder
> of the host transfer buffer. The zero-fill matches the existing
> short-response pattern in nvmet_execute_get_log_changed_ns()
> (admin-cmd.c) and prevents leaking transport SGL contents when the
> host asks for more bytes than the log page contains.
>
> Reported-by: Bryam Vargas <hexlabsecurity at proton.me>
> Suggested-by: Christoph Hellwig <hch at lst.de>
> Fixes: a07b4970f464 ("nvmet: add a generic NVMe target")
> Cc: stable at vger.kernel.org
> Signed-off-by: Bryam Vargas <hexlabsecurity at proton.me>
> ---
> v2: rewrote the validation flow per Christoph's suggestion -- single
> `out_free_buffer` cleanup label reached by `goto` on the offset
> overflow path, `min_t(size_t, ...)` for the capped copy length,
> one fewer level of nesting. `min_t(size_t, ...)` (rather than
> bare `min()`) because `data_len` is `size_t` and `alloc_len -
> offset` promotes to `unsigned long long` (since `offset` is
> `u64`), which trips the kernel min() __typecheck; the size_t
> rendition matches the analogous shape in io-cmd-bdev.c:228.
>
> Empirically verified on a Linux 6.12.90 KASAN INLINE +
> kasan.fault=panic VM: pre-fix `nvme get-log lpo=alloc_len
> len=4096` reboots the host via KASAN catching the OOB memcpy;
> post-fix the same probe returns cleanly with zero kernel-pointer
> qwords leaked, and `lpo > alloc_len` returns
> NVME_SC_INVALID_FIELD as intended.
>
> v1: https://lore.kernel.org/linux-nvme/ -- (search by Reported-by)
>
> drivers/nvme/target/discovery.c | 23 ++++++++++++++++++++++-
> 1 file changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/target/discovery.c b/drivers/nvme/target/discovery.c
> index e9b35549e254..114869d16a1f 100644
> --- a/drivers/nvme/target/discovery.c
> +++ b/drivers/nvme/target/discovery.c
> @@ -166,6 +166,7 @@ static void nvmet_execute_disc_get_log_page(struct nvmet_req *req)
>  	u64 offset = nvmet_get_log_page_offset(req->cmd);
>  	size_t data_len = nvmet_get_log_page_len(req->cmd);
>  	size_t alloc_len;
> +	size_t copy_len;
>  	struct nvmet_subsys_link *p;
>  	struct nvmet_port *r;
>  	u32 numrec = 0;
> @@ -242,7 +243,27 @@ static void nvmet_execute_disc_get_log_page(struct nvmet_req *req)
>
>  	up_read(&nvmet_config_sem);
>
> -	status = nvmet_copy_to_sgl(req, 0, buffer + offset, data_len);
> +	/*
> +	 * Validate the host-supplied log page offset before copying out.
> +	 * Without this check, the host controls a 64-bit byte offset into
> +	 * a small kzalloc'd buffer: a value past the log page lets the
> +	 * subsequent memcpy read adjacent kernel heap, and a value aimed
> +	 * at unmapped kernel memory faults the in-kernel copy and crashes
> +	 * the target host. The Discovery controller is unauthenticated,
> +	 * so the bug is reachable from any reachable fabric peer.
> +	 */
> +	if (offset > alloc_len) {
> +		req->error_loc =
> +			offsetof(struct nvme_get_log_page_command, lpo);
> +		status = NVME_SC_INVALID_FIELD | NVME_STATUS_DNR;
> +		goto out_free_buffer;
> +	}
> +
> +	copy_len = min_t(size_t, data_len, alloc_len - offset);
> +	status = nvmet_copy_to_sgl(req, 0, buffer + offset, copy_len);
> +	if (!status && copy_len < data_len)
> +		status = nvmet_zero_sgl(req, copy_len, data_len - copy_len);
> +out_free_buffer:
>  	kfree(buffer);
>  out:
>  	nvmet_req_complete(req, status);
Looks good.
Reviewed-by: Chaitanya Kulkarni <kch at nvidia.com>
-ck
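
For anyone following the thread, the validate/cap/zero-fill flow in the
quoted hunk can be modeled in userspace. This is only a rough sketch
under stated assumptions: copy_log_page() and the COPY_* status codes
are hypothetical names invented for illustration, and plain
memcpy()/memset() stand in for nvmet_copy_to_sgl()/nvmet_zero_sgl().
It is not the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical status codes for this sketch; not the kernel's NVME_SC_* values. */
enum copy_status { COPY_OK = 0, COPY_INVALID_OFFSET = 1 };

/*
 * Userspace model of the patched flow: reject offsets past the end of
 * the log page, copy only the bytes that exist, and zero the remainder
 * of the destination so nothing stale is handed back to the requester.
 */
static enum copy_status copy_log_page(uint8_t *dst, size_t data_len,
				      const uint8_t *log, size_t alloc_len,
				      uint64_t offset)
{
	size_t avail, copy_len;

	assert(dst != NULL && log != NULL);

	/* offset == alloc_len is allowed and yields an all-zero response. */
	if (offset > alloc_len)
		return COPY_INVALID_OFFSET;

	/* Cannot underflow: offset <= alloc_len was checked above. */
	avail = alloc_len - (size_t)offset;
	copy_len = data_len < avail ? data_len : avail;

	memcpy(dst, log + offset, copy_len);
	memset(dst + copy_len, 0, data_len - copy_len);
	return COPY_OK;
}
```

The ordering matters: the subtraction is only safe because the range
check runs first, which is the same reason the hunk tests
`offset > alloc_len` before computing `alloc_len - offset`.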