[PATCH] NVMe: do not touch sq door bell if nvmeq has been suspended

Keith Busch keith.busch at intel.com
Wed Feb 3 06:41:24 PST 2016


On Tue, Feb 02, 2016 at 07:15:57AM +0000, Wenbo Wang wrote:
> I did the following test to validate the issue.
> 
> 1. Modify code as below to increase the chance of races.
> 	Add 10s delay after nvme_dev_unmap() in nvme_dev_disable()
> 	Add 10s delay before __nvme_submit_cmd()
> 2. Run dd and, at the same time, echo 1 to reset_controller to trigger a device reset.
>    The kernel eventually crashes when it accesses the unmapped door bell register.
> 
> Following is the execution order of the two code paths:
> __blk_mq_run_hw_queue
>   Test BLK_MQ_S_STOPPED
> 					nvme_dev_disable()
> 					     nvme_stop_queues()  <-- set BLK_MQ_S_STOPPED
> 					     nvme_dev_unmap(dev)  <-- unmap door bell
>   nvme_queue_rq()
>       Touch door bell	<-- panic here

Does the following force the first code path (the __blk_mq_run_hw_queue
that has already passed the BLK_MQ_S_STOPPED test) to complete before
the unmap?

---
@@ -1415,10 +1421,21 @@ void nvme_stop_queues(struct nvme_ctrl *ctrl)
 
 		blk_mq_cancel_requeue_work(ns->queue);
 		blk_mq_stop_hw_queues(ns->queue);
+		blk_sync_queue(ns->queue);
 	}
 	mutex_unlock(&ctrl->namespaces_mutex);
 }
--
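
To make the ordering question concrete, here is a toy, single-threaded
model of the interleaving quoted above. It is only a sketch, not kernel
code: every name in it (model_queue, model_run_begin, model_run_finish,
model_dev_disable, model_replay) is hypothetical, and the "sync" flag
merely assumes that the added call waits for an in-flight queue run to
finish, which is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one hardware queue; not a kernel structure. */
struct model_queue {
	bool stopped;         /* stands in for BLK_MQ_S_STOPPED */
	bool doorbell_mapped; /* false once the "unmap" step has run */
	bool submit_pending;  /* passed the stopped test, door bell not rung yet */
	int doorbell_writes;  /* door bell touched while still mapped */
	int faults;           /* door bell touched after unmap (the panic) */
};

/* First half of the submission path: test the stopped flag. */
static bool model_run_begin(struct model_queue *q)
{
	if (q->stopped)
		return false;
	q->submit_pending = true;
	return true;
}

/* Second half: touch the door bell if a submission is still pending. */
static void model_run_finish(struct model_queue *q)
{
	if (!q->submit_pending)
		return;
	q->submit_pending = false;
	if (q->doorbell_mapped)
		q->doorbell_writes++;
	else
		q->faults++;
}

/* Disable path: stop the queue, optionally wait, then unmap. */
static void model_dev_disable(struct model_queue *q, bool sync)
{
	q->stopped = true;
	if (sync) {
		/* Assumption: "waiting" is modelled by letting the pending
		 * submission run to completion before the unmap. */
		model_run_finish(q);
		assert(!q->submit_pending);
	}
	q->doorbell_mapped = false;
}

/* Replay the quoted execution order; returns the number of faults. */
static int model_replay(bool sync)
{
	struct model_queue q = { .doorbell_mapped = true };

	model_run_begin(&q);          /* Test BLK_MQ_S_STOPPED */
	model_dev_disable(&q, sync);  /* stop queues, then unmap */
	model_run_finish(&q);         /* Touch door bell */
	return q.faults;
}
```

In this model, model_replay(false) returns 1 (the stale submission
touches the unmapped door bell) and model_replay(true) returns 0.
Whether the real blk_sync_queue() gives that guarantee for every
caller of the queue-run path is not something the model can show.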


