[PATCH v8 07/12] iommu/arm-smmu-v3: Add CMDQ_PROD_STOP_FLAG to gate CMDQ submissions

Daniel Mentz danielmentz at google.com
Sun Jun 7 14:42:37 PDT 2026


On Mon, Jun 1, 2026 at 2:59 PM Pranjal Shrivastava <praan at google.com> wrote:
>
> Introduce a new bit flag, CMDQ_PROD_STOP_FLAG (bit 30), in the command
> queue's producer index to safely gate command submissions during device
> suspension.
>
> The flag embeds the suspend state directly into the existing global state
> The flag checked in the cmpxchg loop in arm_smmu_cmdq_issue_cmdlist(),
> which acts as a Point of Commitment, ensuring that no indices are
> reserved or committed once the SMMU begins suspending.
>
> This prevents a situation of "abandoned batches" where indices are
> incremented but commands are never written, which would otherwise
> lead to timeout during the drain poll.
>
> Update queue_inc_prod_n() to preserve this flag during index
> calculations, ensuring that any in-flight commands that successfully
> passed the point of commitment can proceed to completion while the
> flag remains set.
>
> Suggested-by: Daniel Mentz <danielmentz at google.com>
> Reviewed-by: Nicolin Chen <nicolinc at nvidia.com>
> Signed-off-by: Pranjal Shrivastava <praan at google.com>
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 20 +++++++++++++++++++-
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h |  2 ++
>  2 files changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 6c8631e2f153..c4315bdde7d5 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -210,7 +210,8 @@ static int queue_sync_prod_in(struct arm_smmu_queue *q)
>  static u32 queue_inc_prod_n(struct arm_smmu_ll_queue *q, int n)
>  {
>         u32 prod = (Q_WRP(q, q->prod) | Q_IDX(q, q->prod)) + n;
> -       return Q_OVF(q->prod) | Q_WRP(q, prod) | Q_IDX(q, prod);
> +
> +       return Q_OVF(q->prod) | Q_STOP(q->prod) | Q_WRP(q, prod) | Q_IDX(q, prod);
>  }
>
>  static void queue_poll_init(struct arm_smmu_device *smmu,
> @@ -718,8 +719,25 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
>         do {
>                 u64 old;
>
> +               /*
> +                * If the SMMU is suspended/suspending, any new CMDs are elided.
> +                * This loop is the Point of Commitment. If we haven't cmpxchg'd
> +                * our new indices yet, we can safely bail. Once the indices are
> +                * committed, we MUST write valid commands to those slots to
> +                * avoid indefinite polling in the drain function.
> +                */
> +               if (Q_STOP(llq.prod)) {
> +                       local_irq_restore(flags);
> +                       return 0;

On second thought, I no longer believe that this is safe. I understand
that READ_ONCE(cmdq->q.llq.val) implies no ordering guarantees with
respect to any writes to translation tables. In the non-stopped case,
we can rely on the call to dma_wmb() further down in this function.
However, for the stopped case, I can't identify any barriers that
would ensure that the STOP flag is checked only after the writes are
visible to SMMU. Here is an example: Let's assume the following
program order:

 * Write to invalidate PTE
 * Read from cmdq->q.llq.val, determine STOP flag is set, elide TLBI

What prevents the CPU from reordering these operations as follows?

 * Read from cmdq->q.llq.val, determine STOP flag is set, elide TLBI
 * Write to invalidate PTE

Can the following situation occur?

 * Read from cmdq->q.llq.val, determine STOP flag is set, elide TLBI
 * (Different CPU resumes SMMU)
 * SMMU loads old PTE value into TLB
 * Write to invalidate PTE
 * (stale PTE remains in TLB)

I propose the following: If you find the STOP flag set, run dma_mb()
and check again. I'm afraid that running dma_mb() unconditionally
might incur too much of a performance penalty.

I have the same concerns with arm_smmu_cmdq_can_elide(): That
READ_ONCE in arm_smmu_cmdq_can_elide() provides no guarantees in
regards to ordering relative to PTE writes.

>
> +               }
> +
>                 while (!queue_has_space(&llq, n + sync)) {
>                         local_irq_restore(flags);
> +
> +                       /* Avoid waiting for space if the SMMU is suspending */
> +                       if (Q_STOP(READ_ONCE(cmdq->q.llq.prod)))
> +                               return 0;
> +
>                         if (arm_smmu_cmdq_poll_until_not_full(smmu, cmdq, &llq))
>                                 dev_err_ratelimited(smmu->dev, "CMDQ timeout\n");
>                         local_irq_save(flags);



More information about the linux-arm-kernel mailing list