[PATCH 1/2] nvme: don't schedule multiple resets

Christoph Hellwig hch at lst.de
Thu Oct 6 02:34:49 PDT 2016


On Wed, Oct 05, 2016 at 04:32:45PM -0400, Keith Busch wrote:
> The queue_work only fails if the work is pending, but not yet running. If
> the work is running, the work item would get requeued, triggering a
> double reset. If the first reset fails for any reason, the second
> reset triggers:
> 
> 	WARN_ON(dev->ctrl.state == NVME_CTRL_RESETTING)
> 
> Hitting that schedules controller deletion for a second time, which
> potentially takes a reference on the device that is being deleted.
> If the reset occurs at the same time as a hot removal event, this causes
> a double-free.
> 
> This patch has the reset helper function check if the work is busy
> prior to queueing, and changes all places that schedule resets to use
> this function. Since most users don't want to sync with that work, the
> "flush_work" is moved to the only caller that wants to sync.

Looks fine.  I actually have something very similar in an old
branch, except that I also moved nvme_reset to common code
and made the fabrics drivers use it.  I really need to get
back to that stuff.

Reviewed-by: Christoph Hellwig <hch at lst.de>
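
The double-queue behaviour described in the quoted commit message can be
illustrated with a small user-space model. This is only a sketch under
stated assumptions: it is not kernel code and not the patch itself. The
toy_* names and struct toy_work are hypothetical, and the pending/running
flags are a simplification of the workqueue states the description refers
to. It shows that queueing fails only while the item is pending, so a
second request made while the callback is running is accepted unless a
helper checks the busy state first.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a work item; NOT the kernel's struct work_struct. */
struct toy_work {
	bool pending;	/* queued, callback not yet started */
	bool running;	/* callback currently executing */
	int queued;	/* number of successful queue attempts */
};

/* Models the described behaviour: fails only while the item is pending. */
static bool toy_queue_work(struct toy_work *w)
{
	if (w->pending)
		return false;
	w->pending = true;
	w->queued++;
	return true;
}

/* Worker picks the item up: pending is cleared before the callback runs. */
static void toy_start(struct toy_work *w)
{
	assert(w->pending);
	w->pending = false;
	w->running = true;
}

/* Callback has returned. */
static void toy_finish(struct toy_work *w)
{
	w->running = false;
}

/* Busy means pending or running (hypothetical stand-in for a busy check). */
static bool toy_work_busy(const struct toy_work *w)
{
	return w->pending || w->running;
}

/* Reset helper in the spirit of the description: refuse if already busy. */
static int toy_reset(struct toy_work *w)
{
	if (toy_work_busy(w))
		return -1;	/* a real driver would return an errno value */
	if (!toy_queue_work(w))
		return -1;
	return 0;
}

/*
 * Issue a second reset request while the first one is running and
 * return how many times the work ended up being queued.
 */
static int toy_double_reset(bool use_helper)
{
	struct toy_work w = { 0 };

	toy_queue_work(&w);		/* first reset request */
	toy_start(&w);			/* worker begins the reset */
	if (use_helper)
		toy_reset(&w);		/* rejected: work is running */
	else
		toy_queue_work(&w);	/* accepted: no longer pending */
	return w.queued;
}
```

In this model, toy_double_reset(false) queues the work twice, which is the
double reset the description warns about, while toy_double_reset(true)
queues it once.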


