[PATCH] nvme-pci: fix race between pci reset and nvme probe
Keith Busch
kbusch at kernel.org
Mon Aug 1 07:33:01 PDT 2022
On Mon, Aug 01, 2022 at 08:57:53PM +0800, Ming Lei wrote:
> After nvme_probe() returns, device lock is released, and PCI reset
> handler may come, meantime reset work is just scheduled and should
> be in-progress.
>
> When nvme_reset_prepare() is run, all NSs may not be allocated yet
> and each NS's request queue won't be frozen by nvme_dev_disable().
>
> But when nvme_reset_done() is called for resetting controller, all
> NSs may have been scanned successfully, and nvme_wait_freeze() is
> called on un-frozen request queues, then wait forever.
>
> Fix the issue by holding device lock for resetting from nvme probe.
>
> Reported-by: Yi Zhang <yi.zhang at redhat.com>
> Link: https://lore.kernel.org/linux-block/CAHj4cs--KPTAGP=jj+7KMe=arDv=HeGeOgs1T8vbusyk=EjXow@mail.gmail.com/#r
> Signed-off-by: Ming Lei <ming.lei at redhat.com>
> ---
> drivers/nvme/host/pci.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 4232192e10dd..d49b1a082983 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -3075,9 +3075,14 @@ static unsigned long check_vendor_combination_bug(struct pci_dev *pdev)
>  static void nvme_async_probe(void *data, async_cookie_t cookie)
>  {
>  	struct nvme_dev *dev = data;
> +	struct pci_dev *pdev = to_pci_dev(dev->dev);
>
> +	pci_dev_lock(pdev);
> +	nvme_reset_ctrl(&dev->ctrl);
>  	flush_work(&dev->ctrl.reset_work);
>  	flush_work(&dev->ctrl.scan_work);
> +	pci_dev_unlock(pdev);
> +
>  	nvme_put_ctrl(&dev->ctrl);
>  }

When low on memory, async_schedule() falls back to calling the requested
function directly, so this would deadlock on taking the pci_dev_lock() a
second time within the probe context.

If it is successfully scheduled asynchronously, holding the lock blocks a hot
removal, which might be the only thing that can unblock the nvme reset_work
and let it make forward progress.

If you are encountering an nvme_reset_prepare() condition during scanning, that
might indicate a failure to communicate with the end device. The scan work may
need the error handling to unblock it.

> @@ -3154,7 +3159,6 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>
>  	dev_info(dev->ctrl.device, "pci function %s\n", dev_name(&pdev->dev));
>
> -	nvme_reset_ctrl(&dev->ctrl);
>  	async_schedule(nvme_async_probe, dev);
>
>  	return 0;
> --
> 2.31.1
>