[PATCH v2] RDMA/cma: prevent rdma id destroy during cma_iw_handler

Leon Romanovsky leon at kernel.org
Tue Jun 13 11:07:47 PDT 2023


On Tue, Jun 13, 2023 at 10:30:37AM -0300, Jason Gunthorpe wrote:
> On Tue, Jun 13, 2023 at 01:43:43AM +0000, Shinichiro Kawasaki wrote:
> > > I think there is likely some much larger issue with the IW CM if the
> > > cm_id can be destroyed while the iwcm_id is in use? It is weird that
> > > there are two id memories for this :\
> > 
> > My understanding of the call chain to rdma id destroy is as follows. I guess
> > _destroy_id calls iw_destroy_cm_id before destroying the rdma id, but I am not
> > sure why it does not wait for the cm_id deref by cm_work_handler.
> > 
> > nvme_rdma_teardown_io_queues
> >  nvme_rdma_stop_io_queues -> chained to cma_iw_handler
> >  nvme_rdma_free_io_queues
> >   nvme_rdma_free_queue
> >    rdma_destroy_id
> >     mutex_lock(&id_priv->handler_mutex)
> >     destroy_id_handler_unlock
> >      mutex_unlock(&id_priv->handler_mutex)
> >      _destroy_id
> >        iw_destroy_cm_id
> >        wait_for_completion(&id_priv->comp)
> >        kfree(id_priv)
> 
> Once a destroy_cm_id() has returned that layer is no longer
> permitted to run or be running in its handlers. The iw cm is broken if
> it allows this, and that is the cause of the bug.
> 
> Taking more refs within handlers that are already not allowed to be
> running is just racy.
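
The rule stated above (once destroy has returned, no handler of that layer may run or still be running) can be illustrated with a minimal user-space sketch. This is not the iwcm or rdma_cm code: struct demo_cm_id and the demo_* helpers are hypothetical names, and pthreads stand in for the kernel's locking and completion primitives. It only shows the shape of a destroy that blocks new handlers and drains the ones in flight.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a CM id; NOT the kernel's struct iw_cm_id. */
struct demo_cm_id {
	pthread_mutex_t lock;
	pthread_cond_t  idle;       /* signalled when active drops to zero */
	int             active;     /* handlers currently running */
	bool            destroying; /* set once destroy has begun */
};

static void demo_init(struct demo_cm_id *id)
{
	pthread_mutex_init(&id->lock, NULL);
	pthread_cond_init(&id->idle, NULL);
	id->active = 0;
	id->destroying = false;
}

/* Call before running a handler. Returns false once destroy has started. */
static bool demo_handler_enter(struct demo_cm_id *id)
{
	bool ok;

	pthread_mutex_lock(&id->lock);
	ok = !id->destroying;
	if (ok)
		id->active++;
	pthread_mutex_unlock(&id->lock);
	return ok;
}

/* Call when a handler finishes; wakes a destroyer waiting for idle. */
static void demo_handler_exit(struct demo_cm_id *id)
{
	pthread_mutex_lock(&id->lock);
	assert(id->active > 0);
	if (--id->active == 0)
		pthread_cond_broadcast(&id->idle);
	pthread_mutex_unlock(&id->lock);
}

/* Returns only when no handler is running and none can start. */
static void demo_destroy(struct demo_cm_id *id)
{
	pthread_mutex_lock(&id->lock);
	id->destroying = true;
	while (id->active > 0)
		pthread_cond_wait(&id->idle, &id->lock);
	pthread_mutex_unlock(&id->lock);
}
```

With this shape, a caller may free its own per-id memory as soon as demo_destroy() returns, without taking extra references inside the handlers.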

So we need to revert that patch from our rdma-rc.

Thanks

> 
> Jason


