[PATCH 1/2] nvme-pci: Fix multiple races in nvme_setup_io_queues()

Keith Busch kbusch at kernel.org
Tue Jun 22 18:53:31 PDT 2021


On Tue, Jun 22, 2021 at 05:59:30PM -0700, Casey Chen wrote:
> On 6/22/21 8:06 AM, Keith Busch wrote:
> > On Mon, Jun 21, 2021 at 05:27:09PM -0700, Casey Chen wrote:
> > > +	/*
> > > +	 * Free IRQ resources as soon as NVMEQ_ENABLED bit transitions
> > > +	 * from set to unset. If there is a window to it is truely freed,
> > > +	 * pci_free_irq_vectors() jumping into this window will crash.
> > > +	 * And take lock to avoid racing with pci_free_irq_vectors() in
> > > +	 * nvme_dev_disable() path.
> > > +	 */
> > > +	mutex_lock(&dev->shutdown_lock)
> > Sorry, I wasn't clear in previous review. All of the shutdown_locks
> > taken after initialization need to by mutex_trylock()'s. If you have to
> > wait to get the lock, the device is not going to be suitable for
> > continuing initialization after the lock is available.
> > 
> > And looking at this again, if trylock is successful, I think you need to
> > verify controller state is suitable for continuing the initialization.
> I assume you mean all the mutex_lock() except the one in nvme_dev_disable(),
> right ? I agree with replacing all mutex_lock() with mutex_trylock() except
> the one in nvme_dev_disable(). 

I only mean the locks you are adding. The existing usages are correct
as-is.

> But I doubt about checking controller state
> after lock being acquired. Two reasons:
>
> 1. What state can use to check ?

If the state is not NVME_CTRL_CONNECTING after successfully taking the
shutdown_lock, then the controller is not in a state that can proceed
with IO queue setup.

> Suppose there is a state set on surprise
> removal, what if the state check happen before it is set while surprise
> removal is on its way ?
> 
> 2. Even we don't check controller state after acquiring lock, the
> initialization still fail fairly soon if surprise removal happens. We don't
> sacrifice any performance.

Nothing to do with performance; we're not in io path. The suggestion is
about avoiding access to a non-existent device. The timing of a surprise
removal can always defeat such a check, so my suggestion is not critical
since we expect that to be handled anyway, but in my experience,
platforms work better if the OS prevents such access when it knows it
will fail. It's not full-proof.

But as I said, everything should work even without the check, so feel
free to omit it.



More information about the Linux-nvme mailing list