[PATCH v8 09/12] iommu/arm-smmu-v3: Implement pm_runtime & system sleep ops

Mon Jun 15 12:44:25 PDT 2026

On Mon, Jun 15, 2026 at 06:20:27PM +0000, Mostafa Saleh wrote:
> On Mon, Jun 01, 2026 at 09:59:06PM +0000, Pranjal Shrivastava wrote:
> > Implement pm_runtime and system sleep ops for arm-smmu-v3.
> > 
> > The suspend callback configures the SMMU to abort new transactions,
> > disables the main translation unit and then drains the command queue
> > to ensure completion of any in-flight commands. A software gate
> > (STOP_FLAG) and synchronization barriers are used to quiesce the command
> > submission pipeline and ensure state consistency before power-off.
> > 
> > To prevent software metadata flags from leaking into physical registers
> > or polluting the tracking pointer, a newly introduced bitmask
> > (CMDQ_PROD_IDX_MASK) is applied to all register writes and tracking
> > updates.
> > 
> > The resume callback restores the MSI configuration and performs a full
> > device reset via `arm_smmu_device_reset` to bring the SMMU back to an
> > operational state. The MSIs are cached during the msi_write and are
> > restored during the resume operation by using the helper. The STOP_FLAG
> > is cleared only after the CMDQ is enabled in hardware.
> > 
> > Suggested-by: Daniel Mentz <danielmentz at google.com>
> > Signed-off-by: Pranjal Shrivastava <praan at google.com>
> > ---
> >  

[...]

> > +	/* Clear any flags from the previous life */
> > +	atomic_andnot(CMDQ_PROD_STOP_FLAG, &smmu->cmdq.owner_prod);
> > +	atomic_andnot(CMDQ_PROD_STOP_FLAG, &smmu->cmdq.q.llq.atomic.prod);
> 
> Should not that be done from the suspend call?

I'm not sure I understand. We're just clearing the flag here: we set
it in suspend to close the gate and clear it in resume to re-open it.
Clearing it at the end of suspend would be wrong, as it would allow new
submissions while the SMMU is off.

Additionally, I'll remove the redundant operation on owner_prod (the
flag is never set in owner_prod), if that's what you meant?
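
A rough userspace model of the gate described above, using C11 atomics
rather than the kernel API. This is only a sketch: the bit position,
the mask and the helper names (gate_close, gate_open, try_submit) are
assumptions for illustration and are not taken from the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout for illustration only: bit 31 is the software gate. */
#define STOP_FLAG  (1u << 31)
#define IDX_MASK   (STOP_FLAG - 1)

static _Atomic uint32_t prod;

/* Suspend side: close the gate so no new commands are accepted. */
static void gate_close(void)
{
	atomic_fetch_or(&prod, STOP_FLAG);
}

/* Resume side: re-open the gate once the queue is usable again. */
static void gate_open(void)
{
	atomic_fetch_and(&prod, ~STOP_FLAG);
}

/* Submission side: reserve n slots unless the gate is closed. */
static bool try_submit(uint32_t n)
{
	uint32_t old = atomic_load(&prod);

	do {
		if (old & STOP_FLAG)
			return false;	/* gate closed: command is elided */
	} while (!atomic_compare_exchange_weak(&prod, &old,
					       (old + n) & IDX_MASK));

	return true;
}
```

In this model, clearing the flag at the end of the suspend path would
let try_submit() succeed again while the device is powered down, which
is why the clear belongs on the resume side.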

> 
> > +
> >  	/* Invalidate any cached configuration */
> >  	arm_smmu_cmdq_issue_cmd_with_sync(smmu, arm_smmu_make_cmd_cfgi_all());
> >  
> > @@ -4898,6 +4939,21 @@ static int arm_smmu_device_reset(struct arm_smmu_device *smmu)
> >  	if (is_kdump_kernel())
> >  		enables &= ~(CR0_EVTQEN | CR0_PRIQEN);
> >  
> > +	/*
> > +	 * While the SMMU was suspended, concurrent CPU threads may have
> > +	 * updated in-memory structures (such as STEs, CDs, and PTEs).
> > +	 * Any invalidations corresponding to those updates were safely
> > +	 * elided because the command queue was stopped (STOP_FLAG == 1).
> > +	 *
> > +	 * Since the reset invalidate-all commands above have fully cleared
> > +	 * the HW TLBs and config caches, the SMMU will fetch these descriptors
> > +	 * directly from RAM as soon as translation is enabled.
> > +	 *
> > +	 * Add a memory barrier to collect all prior RAM writes to ensure the
> > +	 * SMMU sees a consistent view of memory before translation is enabled.
> > +	 */
> > +	smp_mb();
> 
> Should not that be dma_wmb() as this is syncing with the HW?
> 

Right. As discussed with Daniel on the other thread, the dma_wmb()
inside issue_cmdlist() already ensures that PTE writes have reached
RAM.

The first CFGI_ALL invalidation we issue on resume goes through the
CMDQ's standard submission path, which already includes the necessary
dma_wmb(). This ensures that the hardware sees the correct state before
we set SMMUEN=1. I'll update the comment to clarify that we are relying
on this existing synchronization rather than adding a redundant barrier.

> > +
> >  	/* Enable the SMMU interface */
> >  	enables |= CR0_SMMUEN;
> >  	ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0,
> > @@ -5580,6 +5636,117 @@ static void arm_smmu_device_shutdown(struct platform_device *pdev)
> >  	arm_smmu_device_disable(smmu);
> >  }
> >  
> > +static int __maybe_unused arm_smmu_runtime_suspend(struct device *dev)
> > +{
> > +	struct arm_smmu_device *smmu = dev_get_drvdata(dev);
> > +	struct arm_smmu_cmdq *cmdq = &smmu->cmdq;
> > +	int timeout = ARM_SMMU_SUSPEND_TIMEOUT_US;
> > +	u32 enables, target;
> > +	int ret;
> > +
> > +	/* Abort all transactions before disable to avoid spurious bypass */
> > +	arm_smmu_update_gbpa(smmu, GBPA_ABORT, 0);
> > +
> > +	/* Disable the SMMU via CR0.EN and all queues except CMDQ */
> > +	enables = CR0_CMDQEN;
> > +	ret = arm_smmu_write_reg_sync(smmu, enables, ARM_SMMU_CR0, ARM_SMMU_CR0ACK);
> > +	if (ret) {
> > +		dev_err(smmu->dev, "failed to disable SMMU\n");
> > +		return ret;
> > +	}
> > +
> > +	/*
> > +	 * At this point the SMMU is completely disabled and won't access
> > +	 * any translation/config structures, even speculative accesses
> > +	 * aren't performed as per the IHI0070 spec (section 6.3.9.6).
> > +	 */
> > +
> > +	/* Mark the CMDQ to stop and get the target index before the stop */
> > +	target = atomic_fetch_or_relaxed(CMDQ_PROD_STOP_FLAG, &cmdq->q.llq.atomic.prod);
> 
> As Daniel mentioned, I think this shouldn't be relaxed.
> 

Ack, I agree. I misread the kdoc about this; I'll fix it.
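
For context: in the kernel's atomic API, value-returning RMW operations
without a suffix, such as atomic_fetch_or(), are fully ordered, while
the _relaxed variants impose no ordering. The sketch below is a C11
userspace analogue only; memory_order_seq_cst approximates, but is not
identical to, the kernel's full ordering. The DEMO_* names and the bit
position are assumptions, not values from the patch.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_STOP_FLAG  (1u << 31)	/* assumed bit position */
#define DEMO_IDX_MASK   (DEMO_STOP_FLAG - 1)

/* Relaxed: the OR is atomic, but unordered against nearby accesses. */
static uint32_t stop_relaxed(_Atomic uint32_t *p)
{
	return atomic_fetch_or_explicit(p, DEMO_STOP_FLAG,
					memory_order_relaxed) & DEMO_IDX_MASK;
}

/* Ordered: the same OR, also ordered against earlier and later accesses. */
static uint32_t stop_ordered(_Atomic uint32_t *p)
{
	return atomic_fetch_or_explicit(p, DEMO_STOP_FLAG,
					memory_order_seq_cst) & DEMO_IDX_MASK;
}
```

Both helpers return the same pre-OR index; they differ only in what
other CPUs are guaranteed to observe around the flag update.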

> > +	target &= CMDQ_PROD_IDX_MASK;
> > +
> > +
> > +	/* Wait for the last committed owner to reach the hardware */
> > +	while ((arm_smmu_cmdq_owner_prod_idx(cmdq) != target) && --timeout)
> > +		udelay(1);
> 
> I think --timeout has an off-by-one.
> 

Good catch, I'll fix this!
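
One reading of the off-by-one, as a standalone userspace sketch: with a
pre-decrement in the loop condition, a budget of N yields only N - 1
delays. The helpers below are hypothetical, still_busy() stands in for
the owner_prod check, and the counter stands in for udelay(1). The
post-decrement form is just one possible alternative, not necessarily
the fix that will land.

```c
#include <stdbool.h>

/* Stand-in for the hardware condition; never completes in this demo. */
static bool still_busy(void)
{
	return true;
}

/* Pattern from the quoted patch: pre-decrement in the loop condition. */
static int delays_predec(int timeout)
{
	int delays = 0;

	while (still_busy() && --timeout)
		delays++;	/* stands in for udelay(1) */

	return delays;
}

/* Possible alternative: post-decrement, so all `timeout` delays happen. */
static int delays_postdec(int timeout)
{
	int delays = 0;

	while (still_busy() && timeout--)
		delays++;	/* stands in for udelay(1) */

	return delays;
}
```

Note that the post-decrement form leaves the counter at -1 on expiry, so
any later "did we time out" check would have to account for that.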

Thanks,
Praan