[bug report] nvme-tcp: add NVMe over TCP host driver

Sagi Grimberg sagi at grimberg.me
Wed Sep 1 07:56:55 PDT 2021



On 8/24/21 4:16 PM, Dan Carpenter wrote:
> Hello Sagi Grimberg,
> 
> The patch 3f2304f8c6d6: "nvme-tcp: add NVMe over TCP host driver"
> from Dec 3, 2018, leads to the following
> Smatch static checker warning:
> 
> 	drivers/nvme/host/multipath.c:101 nvme_kick_requeue_lists()
> 	warn: sleeping in atomic context
> 
> drivers/nvme/host/multipath.c
>      97 void nvme_kick_requeue_lists(struct nvme_ctrl *ctrl)
>      98 {
>      99		struct nvme_ns *ns;
>      100
> --> 101 	down_read(&ctrl->namespaces_rwsem);
>      102 	list_for_each_entry(ns, &ctrl->namespaces, list) {
>      103 		if (ns->head->disk)
>      104 			kblockd_schedule_work(&ns->head->requeue_work);
>      105 	}
>      106 	up_read(&ctrl->namespaces_rwsem);
>      107 }
> 
> This is a new Smatch warning I'm working on and it's sort of
> overwhelming because it has generates too many warnings and they're
> complicated to analyze and report.  I'm trying to send these
> automatically generated call trees to see if it will help.
> 
> nvme_fc_unregister_remoteport() <- disables preempt
> -> nvme_fc_ctrl_connectivity_loss()
>     -> nvme_reset_ctrl()
> nvme_fc_ctrl_connectivity_loss() <duplicate>
> nvme_fc_unregister_remoteport() <- disables preempt <duplicate>
> nvme_fc_exit_module() <- disables preempt
> -> nvme_fc_cleanup_for_unload()
>     -> nvme_fc_delete_controllers() <- disables preempt
>        -> nvme_delete_ctrl()
> nvme_tcp_state_change() <- disables preempt
> -> nvme_tcp_error_recovery()
>           -> nvme_change_ctrl_state()
>              -> nvme_kick_requeue_lists()
> 
> 
> I looked at the last call tree and it seems like potentially a real bug.
> 
> nvme_tcp_state_change() <- disables preempt
> -> nvme_tcp_error_recovery()
>           -> nvme_change_ctrl_state()
>              -> nvme_kick_requeue_lists()

That is a correct analysis. However from this flow, it is impossible to
step into the condition that triggers nvme_kick_requeue_lists because we
call nvme_change_ctrl_state with state NVME_CTRL_RESETTING from
nvme_tcp_error_recovery which means that the new state cannot be
NVME_CTRL_LIVE as the state will either transition to the desired state 
or fail and be unchanged.

I wander what action should we take here? make the effort to defer the
call to a workqueue context? or we can safely ignore it?



More information about the Linux-nvme mailing list