nvme_rdma - leaves provider resources allocated
Sagi Grimberg
sagi at grimberg.me
Wed Aug 24 02:31:38 PDT 2016
> Assume an nvme_rdma host has one attached controller in RECONNECTING state, and
> that controller has failed to reconnect at least once and thus is in the
> delay_schedule time before retrying the connection. At that moment, there are
> no cm_ids allocated for that controller because the admin queue and the io
> queues have been freed. So nvme_rdma cannot get a DEVICE_REMOVAL from the
> rdma_cm. This means if the underlying provider module is removed, it will be
> removed with resources still allocated by nvme_rdma. For iw_cxgb4, this causes
> a BUG_ON() in gen_pool_destroy() because MRs are still allocated for the
> controller.
>
> Thoughts on how to fix this?
Hey Steve,
I think it's time to go back to your client register proposal.
I can't think of any way to get it right at the moment...
Maybe if the client only does something meaningful in its remove_one()
callback, to handle device removal, we can get away with it...
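
For illustration only, here is a minimal user-space sketch in C of that
idea: a per-device removal callback that walks the host's controllers and
releases whatever they still hold on the departing device, even when a
controller is between reconnect attempts and has no cm_id. Every name in it
(struct device, struct ctrl, ctrl_add(), the MR counters, and remove_one()
as written here) is a hypothetical stand-in, not the kernel's or
nvme_rdma's actual types or code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of an RDMA device; counts MRs still allocated on it. */
struct device {
	const char *name;
	int mrs_allocated;
};

enum ctrl_state { CTRL_LIVE, CTRL_RECONNECTING, CTRL_DELETED };

/* Hypothetical model of a controller attached to one device. */
struct ctrl {
	struct device *dev;
	enum ctrl_state state;
	int has_cm_id;	/* 0 while waiting to retry the connection */
	int nr_mrs;	/* MRs this controller still holds on dev */
	struct ctrl *next;
};

static struct ctrl *ctrl_list;

/* Create a controller, charge its MRs to the device, and link it in. */
static struct ctrl *ctrl_add(struct device *dev, enum ctrl_state state,
			     int has_cm_id, int nr_mrs)
{
	struct ctrl *c = calloc(1, sizeof(*c));

	if (!c)
		return NULL;
	c->dev = dev;
	c->state = state;
	c->has_cm_id = has_cm_id;
	c->nr_mrs = nr_mrs;
	dev->mrs_allocated += nr_mrs;
	c->next = ctrl_list;
	ctrl_list = c;
	return c;
}

/*
 * Models the removal callback: it does not depend on any cm_id event,
 * so it also reaches controllers that currently have no cm_id.
 * Returns the number of controllers torn down.
 */
static int remove_one(struct device *dev)
{
	int torn_down = 0;

	for (struct ctrl *c = ctrl_list; c; c = c->next) {
		if (c->dev != dev || c->state == CTRL_DELETED)
			continue;
		assert(dev->mrs_allocated >= c->nr_mrs);
		dev->mrs_allocated -= c->nr_mrs;	/* release the MRs */
		c->nr_mrs = 0;
		c->has_cm_id = 0;
		c->state = CTRL_DELETED;
		torn_down++;
	}
	return torn_down;
}
```

In this model, a controller added in CTRL_RECONNECTING with has_cm_id == 0
is still found by remove_one(), so the device's MR count drops back to zero
before the device goes away. Whether the real driver can do the equivalent
safely against a concurrent reconnect is exactly the open question.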