[PATCH 1/2] nvmet: fix ctlr ref counting
Sagi Grimberg
sagi at grimberg.me
Sat Sep 30 11:00:27 PDT 2017
> Current code:
> initializes the reference at allocation
> increments reference when ctlr finds are done
> has decrements to back out of ctlr finds
> decrements in admin queue connect cmd failures to
> cause ctlr free
> in nvmet_sq_destroy() decrements to cause ctlr free
>
> Issue is: nvmet_sq_destroy is called for every queue (admin and
> io), thus the controller is freed upon the first queue termination.
> It's an early "free". There could be other ctlr struct changes made
> while subsequent queues are being torn down.
This is not true, each sq takes a reference on the controller, so
the controller is actually freed only on the _last_ sq destroy.
Can you better explain what is the issue here?
>
> An illustration of use-after-free was seen where the mutex for the
> ctrl was corrupted resulting in a mutex_lock hang.
>
> Fix by:
> - make a formal control ref get routine.
> - have each sq binding to the ctlr do a ctlr get. This will
> now pair with the put on each sq destroy.
> - in sq_destroy, if destroying the admin queue, do an
> additional put so that ctlr will be torn down on last
> queue sq_destroy
The fix is not explained sufficiently too here james. You seem
to add a ref in install_queue and put it twice in sq_destroy.
I don't see how this helps.
More information about the Linux-nvme
mailing list