[PATCH v2] RDMA/cma: prevent rdma id destroy during cma_iw_handler
Leon Romanovsky
leon at kernel.org
Tue Jun 13 11:07:47 PDT 2023
On Tue, Jun 13, 2023 at 10:30:37AM -0300, Jason Gunthorpe wrote:
> On Tue, Jun 13, 2023 at 01:43:43AM +0000, Shinichiro Kawasaki wrote:
> > > I think there is likely some much larger issue with the IW CM if the
> > > cm_id can be destroyed while the iwcm_id is in use? It is weird that
> > > there are two id memories for this :\
> >
> > My understanding of the call chain to rdma id destroy is as follows. I guess
> > _destroy_id calls iw_destroy_cm_id before destroying the rdma id, but I am
> > not sure why it does not wait for the cm_id deref by cm_work_handler.
> >
> > nvme_rdma_teardown_io_queues
> >   nvme_rdma_stop_io_queues -> chained to cma_iw_handler
> >   nvme_rdma_free_io_queues
> >     nvme_rdma_free_queue
> >       rdma_destroy_id
> >         mutex_lock(&id_priv->handler_mutex)
> >         destroy_id_handler_unlock
> >           mutex_unlock(&id_priv->handler_mutex)
> >           _destroy_id
> >             iw_destroy_cm_id
> >             wait_for_completion(&id_priv->comp)
> >             kfree(id_priv)
>
> Once a destroy_cm_id() has returned that layer is no longer
> permitted to run or be running in its handlers. The iw cm is broken if
> it allows this, and that is the cause of the bug.
>
> Taking more refs within handlers that are already not allowed to be
> running is just racy.

So we need to revert that patch from our rdma-rc.

Thanks

>
> Jason
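
To illustrate the rule Jason states above (once destroy has returned, no
handler may be running or start running), here is a minimal userspace
sketch using pthreads. It is not the iwcm or rdma_cm code: struct demo_id
and every demo_* function are hypothetical names made up for this
example, and the mutex plus condition variable only stand in for whatever
synchronization the real layer would use.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a CM id; not the kernel's struct iw_cm_id. */
struct demo_id {
	pthread_mutex_t lock;
	pthread_cond_t idle;	/* signalled when 'running' drops to zero */
	int running;		/* handlers currently executing */
	bool destroying;	/* set once destroy has begun */
	int handled;		/* events that actually reached the handler */
};

static void demo_id_init(struct demo_id *id)
{
	pthread_mutex_init(&id->lock, NULL);
	pthread_cond_init(&id->idle, NULL);
	id->running = 0;
	id->destroying = false;
	id->handled = 0;
}

/* Deliver one event. Returns false if the id is being destroyed. */
static bool demo_deliver_event(struct demo_id *id)
{
	pthread_mutex_lock(&id->lock);
	if (id->destroying) {
		pthread_mutex_unlock(&id->lock);
		return false;
	}
	id->running++;
	pthread_mutex_unlock(&id->lock);

	/* The handler body would run here, outside the lock. */

	pthread_mutex_lock(&id->lock);
	assert(id->running > 0);
	id->handled++;
	if (--id->running == 0)
		pthread_cond_broadcast(&id->idle);
	pthread_mutex_unlock(&id->lock);
	return true;
}

/* Block new handlers, then wait for in-flight ones to finish. */
static void demo_destroy(struct demo_id *id)
{
	pthread_mutex_lock(&id->lock);
	id->destroying = true;
	while (id->running > 0)
		pthread_cond_wait(&id->idle, &id->lock);
	pthread_mutex_unlock(&id->lock);
}

/* Event source thread: keeps trying to deliver events to the id. */
static void *demo_event_source(void *arg)
{
	struct demo_id *id = arg;

	for (int i = 0; i < 100000; i++)
		demo_deliver_event(id);
	return NULL;
}

/* Returns how many events were handled after demo_destroy() returned. */
static int demo_events_after_destroy(void)
{
	struct demo_id id;
	pthread_t src;
	int at_destroy, late;

	demo_id_init(&id);
	if (pthread_create(&src, NULL, demo_event_source, &id) != 0)
		return -1;

	demo_destroy(&id);
	pthread_mutex_lock(&id.lock);
	at_destroy = id.handled;
	pthread_mutex_unlock(&id.lock);

	pthread_join(src, NULL);
	late = id.handled - at_destroy;

	pthread_cond_destroy(&id.idle);
	pthread_mutex_destroy(&id.lock);
	return late;
}
```

In this sketch demo_events_after_destroy() returns 0 however the threads
interleave, because the destroying flag and the running count are
checked and updated under the same lock. Taking an extra reference from
inside a handler does not give that guarantee: the handler may already
be running after destroy has returned, which is the race described
above.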