[PATCH 1/6] IB/core: add support for draining Shared receive queues
Sagi Grimberg
sagi at grimberg.me
Wed Jun 19 02:09:09 PDT 2024
On 18/06/2024 3:10, Max Gurtovoy wrote:
> To avoid leakage for QPs associated with an SRQ, according to the IB spec
> (section 10.3.1):
>
> "Note, for QPs that are associated with an SRQ, the Consumer should take
> the QP through the Error State before invoking a Destroy QP or a Modify
> QP to the Reset State. The Consumer may invoke the Destroy QP without
> first performing a Modify QP to the Error State and waiting for the Affiliated
> Asynchronous Last WQE Reached Event. However, if the Consumer
> does not wait for the Affiliated Asynchronous Last WQE Reached Event,
> then WQE and Data Segment leakage may occur. Therefore, it is good
> programming practice to teardown a QP that is associated with an SRQ
> by using the following process:
> - Put the QP in the Error State;
> - wait for the Affiliated Asynchronous Last WQE Reached Event;
> - either:
> - drain the CQ by invoking the Poll CQ verb and either wait for CQ
> to be empty or the number of Poll CQ operations has exceeded
> CQ capacity size; or
> - post another WR that completes on the same CQ and wait for this
> WR to return as a WC;
> - and then invoke a Destroy QP or Reset QP."
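
(For reference, the sequence the spec describes above maps roughly onto the
following untested sketch against the in-kernel verbs API; the "last_wqe"
completion is only a stand-in for however the consumer observes the Last WQE
Reached event, which is exactly what this patch wires up in the core. Error
handling is omitted for brevity:)

#include <linux/completion.h>
#include <rdma/ib_verbs.h>

static void srq_qp_teardown_sketch(struct ib_qp *qp,
				   struct completion *last_wqe)
{
	struct ib_qp_attr attr = { .qp_state = IB_QPS_ERR };

	/* 1. put the QP in the Error state */
	ib_modify_qp(qp, &attr, IB_QP_STATE);

	/* 2. wait for the affiliated Last WQE Reached event */
	wait_for_completion(last_wqe);

	/*
	 * 3. drain the CQ: either poll it until it is empty (or until
	 *    CQ-capacity polls have been done), or post a marker WR that
	 *    completes on the same CQ and wait for its completion
	 */

	/* 4. only then destroy (or reset) the QP */
	ib_destroy_qp(qp);
}
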
>
> Catch the Last WQE Reached Event in the core layer without involving the
> ULP drivers with extra logic during drain and destroy QP flows.
>
> The "Last WQE Reached" event will only be send on the errant QP, for
> signaling that the SRQ flushed all the WQEs that have been dequeued from
> the SRQ for processing by the associated QP.
>
> Signed-off-by: Max Gurtovoy <mgurtovoy at nvidia.com>
> ---
> drivers/infiniband/core/verbs.c | 83 ++++++++++++++++++++++++++++++++-
> include/rdma/ib_verbs.h | 2 +
> 2 files changed, 84 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
> index 94a7f3b0c71c..9e4df7d40e0c 100644
> --- a/drivers/infiniband/core/verbs.c
> +++ b/drivers/infiniband/core/verbs.c
> @@ -1101,6 +1101,16 @@ EXPORT_SYMBOL(ib_destroy_srq_user);
>
> /* Queue pairs */
>
> +static void __ib_qp_event_handler(struct ib_event *event, void *context)
> +{
> + struct ib_qp *qp = event->element.qp;
> +
> + if (event->event == IB_EVENT_QP_LAST_WQE_REACHED)
> + complete(&qp->srq_completion);
> + else if (qp->registered_event_handler)
> + qp->registered_event_handler(event, qp->qp_context);
There is no reason whatsoever to withhold the LAST_WQE_REACHED event from
the ULP. The ULP may be interested in consuming this event.
This should become:
+static void __ib_qp_event_handler(struct ib_event *event, void *context)
+{
+ struct ib_qp *qp = event->element.qp;
+
+ if (event->event == IB_EVENT_QP_LAST_WQE_REACHED)
+ complete(&qp->srq_completion);
+ if (qp->registered_event_handler)
+ qp->registered_event_handler(event, qp->qp_context);
> +}
> +
> static void __ib_shared_qp_event_handler(struct ib_event *event, void *context)
> {
> struct ib_qp *qp = context;
> @@ -1221,7 +1231,10 @@ static struct ib_qp *create_qp(struct ib_device *dev, struct ib_pd *pd,
> qp->qp_type = attr->qp_type;
> qp->rwq_ind_tbl = attr->rwq_ind_tbl;
> qp->srq = attr->srq;
> - qp->event_handler = attr->event_handler;
> + if (qp->srq)
> + init_completion(&qp->srq_completion);
I think that if you unconditionally complete, you should also
unconditionally initialize.
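Something like this (untested, against the fields this patch adds) would keep
the two consistent:

-	if (qp->srq)
-		init_completion(&qp->srq_completion);
+	init_completion(&qp->srq_completion);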