[Bug Report] NVMe-oF/TCP - NULL Pointer Dereference in `nvmet_tcp_execute_request`

Guoqing Jiang guoqing.jiang at linux.dev
Wed Nov 15 19:28:39 PST 2023


Hi,

On 11/6/23 21:39, Alon Zahavi wrote:
> # Bug Overview
>
> ## The Bug
> There is a null-ptr-deref in `nvmet_tcp_execute_request`.
>
> ## Bug Location
> `drivers/nvme/target/tcp.c` in the function `nvmet_tcp_execute_request`.
>
> ## Bug Class
> Remote Denial of Service
>
> ## Disclaimer:
> This bug was found using Syzkaller with added NVMe-oF/TCP support.
>
> # Technical Details
>
> ## Kernel Report - NULL Pointer Dereference
> ```
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> #PF: supervisor instruction fetch in kernel mode
> #PF: error_code(0x0010) - not-present page
> PGD 800000003c2bc067 P4D 800000003c2bc067 PUD 3dfc5067 PMD 0
> Oops: 0010 [#1] PREEMPT SMP KASAN PTI
> CPU: 0 PID: 2363 Comm: kworker/0:1H Not tainted 6.5.0-rc1+ #4
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> Workqueue: nvmet_tcp_wq nvmet_tcp_io_work
> RIP: 0010:0x0
> Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> RSP: 0018:ffff888013b0fba8 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: ffff888013d50000 RSI: ffffffff833ddfe5 RDI: ffff88800e5a33e8
> RBP: ffff888013b0fcf0 R08: 0000000000000001 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000001 R12: ffff88800e5a33e8
> R13: 0000000000000000 R14: ffff88800e5a33e0 R15: dffffc0000000000
> FS:  0000000000000000(0000) GS:ffff88806cc00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffffffffffffd6 CR3: 0000000016faa003 CR4: 0000000000370ef0
> Call Trace:
>   <TASK>
>   nvmet_tcp_execute_request drivers/nvme/target/tcp.c:578 [inline]
>   nvmet_tcp_try_recv_data drivers/nvme/target/tcp.c:1232 [inline]
>   nvmet_tcp_try_recv_one drivers/nvme/target/tcp.c:1312 [inline]
>   nvmet_tcp_try_recv drivers/nvme/target/tcp.c:1338 [inline]
>   nvmet_tcp_io_work+0x202a/0x2990 drivers/nvme/target/tcp.c:1388
>   process_one_work+0xb54/0x18b0 kernel/workqueue.c:2597
>   worker_thread+0x663/0x1300 kernel/workqueue.c:2748
>   kthread+0x357/0x460 kernel/kthread.c:389
>   ret_from_fork+0x29/0x50 arch/x86/entry/entry_64.S:308
>   </TASK>
> Modules linked in:
> CR2: 0000000000000000
> ---[ end trace 0000000000000000 ]---
> ```
>
> ## Description
>
> ### Tracing The Bug
> `nvmet_tcp_execute_request` (see code block 1) calls
> `cmd->req.execute()`.
> When the reproducer runs, that function pointer is NULL, so the
> indirect call jumps to address 0 and triggers the NULL pointer
> dereference reported above (RIP: 0010:0x0).
>
> Code Block 1:
> ```
> static void nvmet_tcp_execute_request(struct nvmet_tcp_cmd *cmd)
> {
>      if (unlikely(cmd->flags & NVMET_TCP_F_INIT_FAILED))
>          nvmet_tcp_queue_response(&cmd->req);
>      else
>          cmd->req.execute(&cmd->req);
> }
> ```
>
> The reason why `cmd->req.execute` is NULL when we get into the
> `nvmet_tcp_execute_request` function lies in the `nvmet_req_init`
> function (drivers/nvme/target/core.c).
>
> Code Block 2:
> ```
> bool nvmet_req_init(struct nvmet_req *req, struct nvmet_cq *cq,
>          struct nvmet_sq *sq, const struct nvmet_fabrics_ops *ops)
> {
>      ...
>
>      if (unlikely(!req->sq->ctrl))
>          /* will return an error for any non-connect command: */
>          status = nvmet_parse_connect_cmd(req);
>      else if (likely(req->sq->qid != 0))
>          status = nvmet_parse_io_cmd(req);
>      else
>          status = nvmet_parse_admin_cmd(req);
>
>    ...
> }
> ```
>
> The `nvmet_parse_admin_cmd` and `nvmet_parse_connect_cmd`
> functions each assign `req->execute`.
> For example, code block 3 shows the assignment in
> `nvmet_parse_connect_cmd` (drivers/nvme/target/fabrics-cmd.c).
>
> Code Block 3:
> ```
> u16 nvmet_parse_connect_cmd(struct nvmet_req *req)
> {
>      struct nvme_command *cmd = req->cmd;
>
>      ...
>
>      if (cmd->connect.qid == 0)
>          req->execute = nvmet_execute_admin_connect;
>      else
>          req->execute = nvmet_execute_io_connect;
>      return 0;
> }
> ```
>
> ## Root Cause
> When the reproducer runs, `nvmet_parse_connect_cmd` is never called,
> so `req->execute` is never assigned, yet execution still continues to
> `nvmet_tcp_execute_request`.
>
> ## Reproducer
> I am adding a reproducer generated by Syzkaller with some
> optimizations and minor changes.

Could you try the change below to see if it helps?

--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -1062,7 +1062,7 @@ static int nvmet_tcp_done_recv_pdu(struct nvmet_tcp_queue *queue)
                         le32_to_cpu(req->cmd->common.dptr.sgl.length));

                 nvmet_tcp_handle_req_failure(queue, queue->cmd, req);
-               return 0;
+               return -EAGAIN;
         }
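
The intent of the change can be illustrated with a small user-space model. This is only a sketch, not kernel code: every `toy_*` name is hypothetical, and it assumes the caller moves on to the execute step whenever the PDU step returns 0 and stops when it returns a negative value. Under that assumption, returning 0 after a failed init lets the caller reach a NULL `execute` pointer, while returning `-EAGAIN` stops it first.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a request; not the real kernel layout. */
struct toy_req {
	void (*execute)(struct toy_req *req); /* stays NULL if init failed */
	int executed;
};

void toy_execute(struct toy_req *req)
{
	req->executed = 1;
}

/* Models request init: on failure, execute is never assigned. */
int toy_req_init(struct toy_req *req, int valid)
{
	req->execute = NULL;
	req->executed = 0;
	if (!valid)
		return 0;
	req->execute = toy_execute;
	return 1;
}

/* Models the PDU step: fail_ret is what it returns when init fails. */
int toy_done_recv_pdu(struct toy_req *req, int valid, int fail_ret)
{
	if (!toy_req_init(req, valid))
		return fail_ret;
	return 0;
}

/*
 * Models the caller. Returns a negative errno if it stopped early,
 * 1 if it would have called through a NULL pointer, 0 on success.
 */
int toy_recv_one(struct toy_req *req, int valid, int fail_ret)
{
	int ret = toy_done_recv_pdu(req, valid, fail_ret);

	if (ret < 0)
		return ret;       /* stop before the execute step */
	if (req->execute == NULL)
		return 1;         /* the real code would crash here */
	req->execute(req);
	return 0;
}

/* Runs the three cases discussed above. */
void toy_demo(void)
{
	struct toy_req r;

	assert(toy_recv_one(&r, 1, 0) == 0);              /* valid request */
	assert(toy_recv_one(&r, 0, 0) == 1);              /* old: return 0 */
	assert(toy_recv_one(&r, 0, -EAGAIN) == -EAGAIN);  /* new: -EAGAIN  */
}
```

Calling `toy_demo()` exercises all three cases. Whether the real receive path treats the return value the same way is exactly what testing the patch against the reproducer should confirm.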

Thanks,
Guoqing


