nvme-tcp host potential bugs
Sagi Grimberg
sagi at grimberg.me
Wed Dec 8 03:33:09 PST 2021
> Hello Sagi,
>
> We would like to share two host nvme-tcp issues we have encountered on kernel version 4.18.0-348.2.1.el8_5.x86_64 and hear your input
>
> First issue:
> As part of nvme_tcp_create_ctrl, ctrl->queues is allocated per ctrl with size queue_count
> The ctrl queue_count is set in nvme_tcp_alloc_io_queues according to nr_io_queues
The ctrl->queues array is allocated just once, in create_ctrl, is capped
by opts->nr_io_queues, and should never exceed that size.
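
For reference, a rough sketch of that allocation (paraphrased from the
upstream driver from memory; your 4.18 backport may size it slightly
differently, e.g. also accounting for write/poll queues):

	/* sketch, paraphrased: nvme_tcp_create_ctrl() sizes the queues array
	 * once from the connect options; it is never reallocated on reconnect */
	ctrl->ctrl.queue_count = opts->nr_io_queues + 1; /* +1 for the admin queue */
	ctrl->queues = kcalloc(ctrl->ctrl.queue_count,
			       sizeof(*ctrl->queues), GFP_KERNEL);
	if (!ctrl->queues) {
		ret = -ENOMEM;
		goto out_free_ctrl;
	}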
>
> We see a potential issue in the following scenario:
> A connection is being established with x I/O queues and ctrl->queues is allocated to be of size x + 1
This means that opts->nr_io_queues == x
> Assuming there is a reconnection (due to a timeout or any other reason)
> The new connection is being established with y I/O queues (where y > x)
That should not be possible. It can be that the controller refused to
accept all the queues that the host asked for, which means that the
user wants z queues, the controller accepted x queues in the first go
and y queues in the second go, where x < y <= z (according to your
example).
ctrl->queues is sized for z queues; in the first round the host
connected x of these queues, and in the second round it connected y of
them.
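
To illustrate (a rough sketch of the logic, not the exact code in that
kernel): the I/O queue count requested on every connect/reconnect is
derived from the same opts->nr_io_queues that sized the array, and
nvme_set_queue_count() can only lower it, so queue_count should never
outgrow ctrl->queues:

/* rough sketch, paraphrased from the upstream driver */
static int nvme_tcp_alloc_io_queues(struct nvme_ctrl *ctrl)
{
	unsigned int nr_io_queues;
	int ret;

	nr_io_queues = min(ctrl->opts->nr_io_queues, num_online_cpus());

	/* Set Features / Number of Queues: the controller may grant fewer */
	ret = nvme_set_queue_count(ctrl, &nr_io_queues);
	if (ret)
		return ret;

	ctrl->queue_count = nr_io_queues + 1;
	dev_info(ctrl->device, "creating %u I/O queues.\n", nr_io_queues);

	return __nvme_tcp_alloc_io_queues(ctrl);
}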
>
> In this case, ctrl->queues was previously allocated with queue_count x + 1
> But now queue_count is being updated to y + 1
> As part of nvme_tcp_alloc_queue, we have
> struct nvme_tcp_queue *queue = &ctrl->queues[qid];
> which might lead to accessing an out-of-range memory location (when qid >= x + 1)
> again, ctrl->queues was allocated with queue_count == x + 1, not y + 1
>
> To prove the above theory, we added some debug prints when using x == 8 and y == 48:
>
> #creating 8 I/O queues, queue_count == 9, queues points to 00000000fd0a0f0f
Yes, but what is the array size?
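If it helps, a hypothetical one-off debug print right after the
allocation in nvme_tcp_create_ctrl (not something that exists in the
driver) would answer that directly:

	/* hypothetical: print the size ctrl->queues was actually allocated with */
	pr_info("nvme-tcp: queues %p allocated for queue_count %u\n",
		ctrl->queues, ctrl->ctrl.queue_count);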
> Nov 30 14:02:25 nc9127122.drm.lab.emc.com kernel: nvme nvme15: creating 8 I/O queues. queues 00000000fd0a0f0f queue_count 9
> Nov 30 14:02:25 nc9127122.drm.lab.emc.com kernel: nvme nvme15: mapped 8/0/0 default/read/poll queues.
> Nov 30 14:02:25 nc9127122.drm.lab.emc.com kernel: nvme nvme15: Successfully reconnected (1 attempt)
>
> #Timeout occurs that leads to reconnecting:
> Nov 30 14:02:42 nc9127122.drm.lab.emc.com kernel: nvme nvme15: queue 0: timeout request 0x0 type 4
> Nov 30 14:02:42 nc9127122.drm.lab.emc.com kernel: nvme nvme15: starting error recovery
> Nov 30 14:02:42 nc9127122.drm.lab.emc.com kernel: nvme nvme15: failed nvme_keep_alive_end_io error=10
> Nov 30 14:02:42 nc9127122.drm.lab.emc.com kernel: nvme nvme15: Reconnecting in 10 seconds...
>
> #Creating 48 I/O queues, queue_count == 49, queues points again to 00000000fd0a0f0f
again, what is the array size?
> Nov 30 14:02:52 nc9127122.drm.lab.emc.com kernel: nvme nvme15: creating 48 I/O queues. queues 00000000fd0a0f0f queue_count 49
>
> Second issue:
> With the same example as above, where x < y
> As part of the reconnection process, nvme_tcp_configure_io_queues is being called
> In this function, nvme_tcp_start_io_queues is being called with the new (y) queue_count
> Which will lead to an error (when submitting the I/O connect command through the block layer)
But the host won't connect x queues, it will only connect y queues.
Maybe I'm missing something?
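
For completeness, a rough sketch of the start loop (paraphrased, not
the exact code in that kernel): it only walks the queue_count that was
set for this connect attempt, so every queue it sends an I/O connect
on was allocated and TCP-connected in the same round:

/* rough sketch, paraphrased from the upstream driver */
static int nvme_tcp_start_io_queues(struct nvme_ctrl *ctrl)
{
	int i, ret;

	/* queue 0 is the admin queue; I/O queues are 1..queue_count-1 */
	for (i = 1; i < ctrl->queue_count; i++) {
		ret = nvme_tcp_start_queue(ctrl, i);
		if (ret)
			goto out_stop_queues;
	}

	return 0;

out_stop_queues:
	for (i--; i >= 1; i--)
		nvme_tcp_stop_queue(ctrl, i);
	return ret;
}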
>
> Thanks,
> Amit
>
>
> Internal Use - Confidential
I'm assuming that this is not confidential as you are posting this
to Linux-nvme, so please drop this notice from your upstream mails.