nvme-tcp crashes the system when overloading the backend device.
Mark Ruijter
mruijter at primelogic.nl
Thu Sep 2 04:31:45 PDT 2021
Hi Sagi,
I installed a recent kernel on the system and restarted the test.
The kernel is: 5.10.57
Before the kernel could crash, I stopped the test by disconnecting the initiators.
This time "nvmetcli clear" did not hang, and it still managed to remove the configuration:
# ls -l /sys/kernel/config/nvmet/ports/
total 0
# ls -l /sys/kernel/config/nvmet/subsystems/
total 0
However, after this I still see nvmet_tcp_wq workers that are actively running:
# ps aux | grep nvmet
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 163 0.4 0.0 0 0 ? I< 02:28 0:21 [kworker/24:0H-nvmet_tcp_wq]
root 199 0.0 0.0 0 0 ? I< 02:28 0:02 [kworker/30:0H-nvmet_tcp_wq]
root 355 0.0 0.0 0 0 ? I< 02:28 0:04 [kworker/56:0H-nvmet_tcp_wq]
root 361 0.0 0.0 0 0 ? I< 02:28 0:03 [kworker/57:0H-nvmet_tcp_wq]
root 683 1.2 0.0 0 0 ? D< 02:30 0:56 [kworker/53:1H+nvmet_tcp_wq]
root 785 1.0 0.0 0 0 ? D< 02:30 0:44 [kworker/59:1H+nvmet_tcp_wq]
root 1200 0.1 0.0 0 0 ? D< 02:30 0:08 [kworker/27:1H+nvmet_tcp_wq]
root 29212 2.0 0.0 0 0 ? I< 03:28 0:18 [kworker/31:2H-nvmet_tcp_wq]
root 32691 0.0 0.0 0 0 ? I< 02:31 0:00 [nvmet-buffered-]
root 39437 5.3 0.0 0 0 ? D< 03:32 0:35 [kworker/51:3H+nvmet_tcp_wq]
root 39440 1.8 0.0 0 0 ? I< 03:32 0:12 [kworker/59:3H-nvmet_tcp_wq]
root 39458 13.3 0.0 0 0 ? I< 03:32 1:28 [kworker/18:3H-nvmet_tcp_wq]
root 39508 7.1 0.0 0 0 ? D< 03:32 0:47 [kworker/53:4H+nvmet_tcp_wq]
root 39511 2.7 0.0 0 0 ? D< 03:32 0:17 [kworker/28:5H+nvmet_tcp_wq]
root 39520 7.7 0.0 0 0 ? D< 03:32 0:51 [kworker/52:3H+nvmet_tcp_wq]
root 39855 4.3 0.0 0 0 ? I< 03:32 0:28 [kworker/48:4H-nvmet_tcp_wq]
root 39857 3.0 0.0 0 0 ? D< 03:32 0:20 [kworker/28:7H+nvmet_tcp_wq]
root 39902 6.3 0.0 0 0 ? D< 03:32 0:41 [kworker/27:6H+nvmet_tcp_wq]
root 39928 5.3 0.0 0 0 ? D< 03:32 0:35 [kworker/25:9H+nvmet_tcp_wq]
root 39963 8.8 0.0 0 0 ? D< 03:32 0:57 [kworker/24:6H+nvmet_tcp_wq]
root 40024 3.3 0.0 0 0 ? I< 03:32 0:21 [kworker/28:9H-nvmet_tcp_wq]
root 40087 6.3 0.0 0 0 ? I< 03:32 0:41 [kworker/53:6H-nvmet_tcp_wq]
root 40169 6.1 0.0 0 0 ? D< 03:32 0:40 [kworker/59:5H+nvmet_tcp_wq]
root 40201 3.5 0.0 0 0 ? D< 03:32 0:23 [kworker/54:8H+nvmet_tcp_wq]
root 40333 0.6 0.0 0 0 ? D< 03:32 0:04 [kworker/59:7H+nvmet_tcp_wq]
root 40371 0.4 0.0 0 0 ? I< 03:32 0:03 [kworker/49:5H-nvmet_tcp_wq]
root 40375 2.5 0.0 0 0 ? I< 03:32 0:16 [kworker/20:8H-nvmet_tcp_wq]
root 40517 0.4 0.0 0 0 ? I< 03:32 0:02 [kworker/58:6H-nvmet_tcp_wq]
root 40811 2.8 0.0 0 0 ? D< 03:33 0:17 [kworker/51:9H+nvmet_tcp_wq]
root 40864 1.5 0.0 0 0 ? I< 03:33 0:09 [kworker/29:5H-nvmet_tcp_wq]
root 40891 1.7 0.0 0 0 ? I< 03:33 0:10 [kworker/17:9H-nvmet_tcp_wq]
root 40902 4.3 0.0 0 0 ? D< 03:33 0:25 [kworker/59:8H+nvmet_tcp_wq]
root 41061 3.3 0.0 0 0 ? I< 03:34 0:18 [kworker/51:10H-nvmet_tcp_wq]
root 41145 2.6 0.0 0 0 ? D< 03:34 0:14 [kworker/56:7H+nvmet_tcp_wq]
root 41278 1.3 0.0 0 0 ? I< 03:34 0:07 [kworker/22:9H-nvmet_tcp_wq]
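As far as I can tell these are the per-queue io_work items. If I read
drivers/nvme/target/tcp.c correctly, nvmet_tcp_wq is created as a high-priority
workqueue, which is why the stuck workers all show up as "H" kworkers above
(roughly; details may differ per kernel version):

static int __init nvmet_tcp_init(void)
{
        int ret;

        /* WQ_HIGHPRI is why these show up as kworker/N:xH threads */
        nvmet_tcp_wq = alloc_workqueue("nvmet_tcp_wq", WQ_HIGHPRI, 0);
        if (!nvmet_tcp_wq)
                return -ENOMEM;
        ...
}
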
I've attached dmesg.txt as requested.
Thanks,
On 01/09/2021, 16:47, "Sagi Grimberg" <sagi at grimberg.me> wrote:
> Hi Sagi,
>
> I can reproduce this problem with any recent kernel.
> At least all of the kernels I tested suffer from the problem: 5.10.40, 5.10.57 and 5.14-rc4, as well as SuSE SLES15-SP2 with kernel 5.3.18-24.37-default.
> On the initiator I use Ubuntu 20.04 LTS with kernel 5.10.0-1019.
Thanks.
>> Is it possible to check if the R5 device has inflight commands? If not,
>> there is some race condition or misaccounting that prevents an orderly
>> shutdown of the queues.
>
> I will double check; however, I don't think that the underlying device is the problem.
> The exact same test passes with the nvmet-rdma target.
> It only fails with the nvmet-tcp target driver.
OK, that is useful information.
>
> As far as I can tell, I exhaust the budget in nvmet_tcp_io_work and requeue:
>
> 1293 } while (pending && ops < NVMET_TCP_IO_WORK_BUDGET);
> 1294
> 1295 /*
> 1296 * Requeue the worker if idle deadline period is in progress or any
> 1297 * ops activity was recorded during the do-while loop above.
> 1298 */
> 1299 if (nvmet_tcp_check_queue_deadline(queue, ops) || pending)
> 1300 queue_work_on(queue_cpu(queue), nvmet_tcp_wq, &queue->io_work);
>
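> For context, those are the last lines of nvmet_tcp_io_work; the do-while loop
> they terminate looks roughly like this in my tree (paraphrased, so details may
> differ between kernel versions):
>
> static void nvmet_tcp_io_work(struct work_struct *w)
> {
>         struct nvmet_tcp_queue *queue =
>                 container_of(w, struct nvmet_tcp_queue, io_work);
>         bool pending;
>         int ret, ops = 0;
>
>         do {
>                 pending = false;
>
>                 ret = nvmet_tcp_try_recv(queue, NVMET_TCP_RECV_BUDGET, &ops);
>                 if (ret > 0)
>                         pending = true;
>                 else if (ret < 0)
>                         return;
>
>                 ret = nvmet_tcp_try_send(queue, NVMET_TCP_SEND_BUDGET, &ops);
>                 if (ret > 0)
>                         pending = true;
>                 else if (ret < 0)
>                         return;
>         } while (pending && ops < NVMET_TCP_IO_WORK_BUDGET);
>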
> I added pr_info statements in the code to determine what is going on:
> 2021-09-01T07:15:26.944067-06:00 gold kernel: [ 5502.786914] nvmet_tcp: MARK exhausted budget: ret = 0, ops = 71
> 2021-09-01T07:15:26.944070-06:00 gold kernel: [ 5502.787455] nvmet: ctrl 49 keep-alive timer (15 seconds) expired!
> 2021-09-01T07:15:26.944072-06:00 gold kernel: [ 5502.787461] nvmet: ctrl 49 fatal error occurred!
>
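> If I follow the code, that keep-alive message is printed by
> nvmet_keep_alive_timer() in drivers/nvme/target/core.c, which then calls
> nvmet_ctrl_fatal_error() to set CSTS.CFS and schedule the fatal error work
> (roughly, paraphrased from my tree):
>
> void nvmet_ctrl_fatal_error(struct nvmet_ctrl *ctrl)
> {
>         mutex_lock(&ctrl->lock);
>         if (!(ctrl->csts & NVME_CSTS_CFS)) {
>                 ctrl->csts |= NVME_CSTS_CFS;
>                 schedule_work(&ctrl->fatal_err_work);
>         }
>         mutex_unlock(&ctrl->lock);
> }
>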
> Shortly after, the routine nvmet_fatal_error_handler gets triggered:
> static void nvmet_fatal_error_handler(struct work_struct *work)
> {
>         struct nvmet_ctrl *ctrl =
>                 container_of(work, struct nvmet_ctrl, fatal_err_work);
>
>         pr_err("ctrl %d fatal error occurred!\n", ctrl->cntlid);
>         ctrl->ops->delete_ctrl(ctrl);
> }
>
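> For the TCP transport, ctrl->ops->delete_ctrl is nvmet_tcp_delete_ctrl, which
> as far as I can see just shuts down the sockets of the controller's queues and
> leaves the actual teardown to the queue release work (roughly, from
> drivers/nvme/target/tcp.c):
>
> static void nvmet_tcp_delete_ctrl(struct nvmet_ctrl *ctrl)
> {
>         struct nvmet_tcp_queue *queue;
>
>         mutex_lock(&nvmet_tcp_queue_mutex);
>         list_for_each_entry(queue, &nvmet_tcp_queue_list, queue_list)
>                 if (queue->nvme_sq.ctrl == ctrl)
>                         kernel_sock_shutdown(queue->sock, SHUT_RDWR);
>         mutex_unlock(&nvmet_tcp_queue_mutex);
> }
>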
> Some of the nvmet_tcp_wq workers now keep running, and the number of workers keeps increasing:
> root 3686 3.3 0.0 0 0 ? I< 07:31 0:29 [kworker/11:0H-nvmet_tcp_wq]
> root 3689 12.0 0.0 0 0 ? I< 07:31 1:43 [kworker/25:0H-nvmet_tcp_wq]
> root 3695 12.0 0.0 0 0 ? I< 07:31 1:43 [kworker/55:3H-nvmet_tcp_wq]
> root 3699 5.0 0.0 0 0 ? I< 07:31 0:43 [kworker/38:1H-nvmet_tcp_wq]
> root 3704 11.5 0.0 0 0 ? I< 07:31 1:39 [kworker/21:0H-nvmet_tcp_wq]
> root 3708 12.1 0.0 0 0 ? I< 07:31 1:44 [kworker/31:0H-nvmet_tcp_wq]
>
> "nvmetcli clear" will no longer return after this and when you keep the initiators running the system eventually crashes.
>
OK, so maybe some more information can help. When you reproduce this for the
first time, I would dump all the threads in the system to dmesg.
So if you can do the following:
1. reproduce the hang
2. nvmetcli clear
3. echo t > /proc/sysrq-trigger
And share the dmesg output with us?
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dmesg.txt.xz
Type: application/octet-stream
Size: 58372 bytes
Desc: dmesg.txt.xz
URL: <http://lists.infradead.org/pipermail/linux-nvme/attachments/20210902/e15728a6/attachment-0001.obj>