[PATCH 1/6] IB/core: add support for draining Shared receive queues

Sagi Grimberg sagi at grimberg.me
Wed Jun 19 02:09:09 PDT 2024



On 18/06/2024 3:10, Max Gurtovoy wrote:
> To avoid leakage for QPs associated with an SRQ, according to the IB spec
> (section 10.3.1):
>
> "Note, for QPs that are associated with an SRQ, the Consumer should take
> the QP through the Error State before invoking a Destroy QP or a Modify
> QP to the Reset State. The Consumer may invoke the Destroy QP without
> first performing a Modify QP to the Error State and waiting for the Affiliated
> Asynchronous Last WQE Reached Event. However, if the Consumer
> does not wait for the Affiliated Asynchronous Last WQE Reached Event,
> then WQE and Data Segment leakage may occur. Therefore, it is good
> programming practice to teardown a QP that is associated with an SRQ
> by using the following process:
>   - Put the QP in the Error State;
>   - wait for the Affiliated Asynchronous Last WQE Reached Event;
>   - either:
>     - drain the CQ by invoking the Poll CQ verb and either wait for CQ
>       to be empty or the number of Poll CQ operations has exceeded
>       CQ capacity size; or
>     - post another WR that completes on the same CQ and wait for this
>       WR to return as a WC;
>   - and then invoke a Destroy QP or Reset QP."
>
> Catch the Last WQE Reached Event in the core layer without involving the
> ULP drivers with extra logic during drain and destroy QP flows.
>
> The "Last WQE Reached" event will only be sent on the errant QP, for
> signaling that the SRQ flushed all the WQEs that have been dequeued from
> the SRQ for processing by the associated QP.
>
> Signed-off-by: Max Gurtovoy <mgurtovoy at nvidia.com>
> ---
>   drivers/infiniband/core/verbs.c | 83 ++++++++++++++++++++++++++++++++-
>   include/rdma/ib_verbs.h         |  2 +
>   2 files changed, 84 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
> index 94a7f3b0c71c..9e4df7d40e0c 100644
> --- a/drivers/infiniband/core/verbs.c
> +++ b/drivers/infiniband/core/verbs.c
> @@ -1101,6 +1101,16 @@ EXPORT_SYMBOL(ib_destroy_srq_user);
>   
>   /* Queue pairs */
>   
> +static void __ib_qp_event_handler(struct ib_event *event, void *context)
> +{
> +	struct ib_qp *qp = event->element.qp;
> +
> +	if (event->event == IB_EVENT_QP_LAST_WQE_REACHED)
> +		complete(&qp->srq_completion);
> +	else if (qp->registered_event_handler)
> +		qp->registered_event_handler(event, qp->qp_context);

There is no reason whatsoever to withhold the LAST_WQE_REACHED event
from the ULP; the ULP may be interested in consuming it.

This should become:

+static void __ib_qp_event_handler(struct ib_event *event, void *context)
+{
+	struct ib_qp *qp = event->element.qp;
+
+	if (event->event == IB_EVENT_QP_LAST_WQE_REACHED)
+		complete(&qp->srq_completion);
+	if (qp->registered_event_handler)
+		qp->registered_event_handler(event, qp->qp_context);

> +}
> +
>   static void __ib_shared_qp_event_handler(struct ib_event *event, void *context)
>   {
>   	struct ib_qp *qp = context;
> @@ -1221,7 +1231,10 @@ static struct ib_qp *create_qp(struct ib_device *dev, struct ib_pd *pd,
>   	qp->qp_type = attr->qp_type;
>   	qp->rwq_ind_tbl = attr->rwq_ind_tbl;
>   	qp->srq = attr->srq;
> -	qp->event_handler = attr->event_handler;
> +	if (qp->srq)
> +		init_completion(&qp->srq_completion);

I think that if you complete unconditionally, you should also
initialize unconditionally.
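
i.e., something like (sketch against the posted hunk, untested):

+	init_completion(&qp->srq_completion);

instead of:

+	if (qp->srq)
+		init_completion(&qp->srq_completion);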


