[PATCH] Revert "nvme: remove the .stop_ctrl callout"

liruozhu liruozhu at huawei.com
Thu Jun 16 19:55:46 PDT 2022


Hi all,

Any thoughts on this issue?

thanks,
Ruozhu

On 2022/6/10 17:56, Ruozhu Li wrote:

> We encountered a problem where the disconnect command hangs.
> After analyzing the log and stack, we found that it is triggered
> by the following sequence:
> CPU0                            CPU1
>                                 nvme_rdma_error_recovery_work
>                                   nvme_rdma_teardown_io_queues
> nvme_do_delete_ctrl                 nvme_stop_queues
>   nvme_remove_namespaces
>   --clear ctrl->namespaces
>                                     nvme_start_queues
>                                     --no ns in ctrl->namespaces
>     nvme_ns_remove                return (because ctrl is deleting)
>       blk_freeze_queue
>         blk_mq_freeze_queue_wait
>         --wait for ns to unquiesce to clean inflight IO, hang forever
>
> This problem did not occur in older kernels because they flushed the
> err work in nvme_stop_ctrl before nvme_remove_namespaces. That change
> does not seem to have been made for functional reasons, so the patch
> can be reverted to solve the problem.
>
> Revert commit 794a4cb3d2f7 ("nvme: remove the .stop_ctrl callout")
>
> Signed-off-by: Ruozhu Li <liruozhu at huawei.com>
> ---
>   drivers/nvme/host/core.c |  2 ++
>   drivers/nvme/host/nvme.h |  1 +
>   drivers/nvme/host/rdma.c | 12 +++++++++---
>   drivers/nvme/host/tcp.c  | 10 +++++++---
>   4 files changed, 19 insertions(+), 6 deletions(-)
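
The interleaving in the trace above can be replayed in a small user-space
model. This is only a sketch, not kernel code and not part of the patch:
struct toy_ctrl and every toy_* function are hypothetical names that mirror
the order of the real calls, and the flush_err_work_first flag stands in for
the assumed effect of the revert (the delete path waits for error recovery
to finish before it removes namespaces).

```c
#include <stdbool.h>

/*
 * Toy user-space model of the race in the quoted trace. This is NOT kernel
 * code: struct toy_ctrl and every toy_* function are hypothetical names
 * that only mirror the ordering of the real calls.
 */
struct toy_ctrl {
	int  ns_on_list;   /* namespaces still linked on ctrl->namespaces */
	bool ns_quiesced;  /* true while the namespace queue is quiesced */
};

/* Models nvme_stop_queues(): quiesce every namespace still on the list. */
static void toy_stop_queues(struct toy_ctrl *c)
{
	if (c->ns_on_list > 0)
		c->ns_quiesced = true;
}

/* Models nvme_start_queues(): unquiesce only namespaces still on the list. */
static void toy_start_queues(struct toy_ctrl *c)
{
	if (c->ns_on_list > 0)
		c->ns_quiesced = false;
}

/* Models nvme_remove_namespaces() clearing ctrl->namespaces. */
static void toy_detach_namespaces(struct toy_ctrl *c)
{
	c->ns_on_list = 0;
}

/*
 * Replays the interleaving from the trace. Returns true when the final
 * freeze wait would hang, i.e. the namespace queue is still quiesced when
 * nvme_ns_remove() reaches blk_mq_freeze_queue_wait().
 */
bool toy_disconnect_hangs(bool flush_err_work_first)
{
	struct toy_ctrl c = { .ns_on_list = 1, .ns_quiesced = false };

	/* CPU1: error recovery tears down the I/O queues. */
	toy_stop_queues(&c);

	if (flush_err_work_first) {
		/* CPU0 waits, so CPU1 restarts the queues while ns is listed. */
		toy_start_queues(&c);
		toy_detach_namespaces(&c);
	} else {
		/* CPU0 clears the list before CPU1 restarts the queues. */
		toy_detach_namespaces(&c);
		toy_start_queues(&c);	/* finds no namespace, does nothing */
	}

	/* CPU0: the freeze wait cannot finish while the queue is quiesced. */
	return c.ns_quiesced;
}
```

In this model toy_disconnect_hangs(false) returns true, which corresponds to
the hang in the trace, and toy_disconnect_hangs(true) returns false because
the queue is unquiesced while the namespace is still on the list.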


