[PATCH] nvme-pci: fix race between pci reset and nvme probe

Keith Busch kbusch at kernel.org
Mon Aug 1 07:33:01 PDT 2022


On Mon, Aug 01, 2022 at 08:57:53PM +0800, Ming Lei wrote:
> After nvme_probe() returns, device lock is released, and PCI reset
> handler may come, meantime reset work is just scheduled and should
> be in-progress.
> 
> When nvme_reset_prepare() is run, all NSs may not be allocated yet
> and each NS's request queue won't be frozen by nvme_dev_disable().
> 
> But when nvme_reset_done() is called for resetting controller, all
> NSs may have been scanned successfully, and nvme_wait_freeze() is
> called on un-frozen request queues, then wait forever.
> 
> Fix the issue by holding device lock for resetting from nvme probe.
> 
> Reported-by: Yi Zhang <yi.zhang at redhat.com>
> Link: https://lore.kernel.org/linux-block/CAHj4cs--KPTAGP=jj+7KMe=arDv=HeGeOgs1T8vbusyk=EjXow@mail.gmail.com/#r
> Signed-off-by: Ming Lei <ming.lei at redhat.com>
> ---
>  drivers/nvme/host/pci.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 4232192e10dd..d49b1a082983 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -3075,9 +3075,14 @@ static unsigned long check_vendor_combination_bug(struct pci_dev *pdev)
>  static void nvme_async_probe(void *data, async_cookie_t cookie)
>  {
>  	struct nvme_dev *dev = data;
> +	struct pci_dev *pdev = to_pci_dev(dev->dev);
>  
> +	pci_dev_lock(pdev);
> +	nvme_reset_ctrl(&dev->ctrl);
>  	flush_work(&dev->ctrl.reset_work);
>  	flush_work(&dev->ctrl.scan_work);
> +	pci_dev_unlock(pdev);
> +
>  	nvme_put_ctrl(&dev->ctrl);
>  }

When low on memory, async_schedule() falls back to calling the requested
function directly, so this would deadlock on taking the pci_dev_lock() the
second time within the probe context.

If it is successfully scheduled asynchronously, holding the lock blocks a hot
removal, which might be the only thing that can unblock the nvme reset_work
from forward progress.

If you are encountering a nvme_reset_prepare() condition during scanning, that
might indicate a failure to communicate with the end device. The scan work may
need the error handling to unblock it.

> @@ -3154,7 +3159,6 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  
>  	dev_info(dev->ctrl.device, "pci function %s\n", dev_name(&pdev->dev));
>  
> -	nvme_reset_ctrl(&dev->ctrl);
>  	async_schedule(nvme_async_probe, dev);
>  
>  	return 0;
> -- 
> 2.31.1
> 



More information about the Linux-nvme mailing list