[PATCH v4 11/24] iommu: Add iommu_report_device_broken() to quarantine a broken device
Nicolin Chen
nicolinc at nvidia.com
Tue May 19 17:21:36 PDT 2026
On Tue, May 19, 2026 at 08:02:04PM -0300, Jason Gunthorpe wrote:
> > OK. So you are suggesting a quarantine at the driver-level only:
> >
> > 1. Driver detects ATC_INV timeout during an invalidation.
> > 2. Driver retries the commands to identify the master.
>
> I might argue to push even this out to a followup series given it is
> complex and I suspect it becomes much simpler after the batch
> removal...
I see, you suggest treating the entire batch as ATS-broken. Just to
confirm: without a per-SID retry, that might falsely block a healthy
device in the ATC batch, right? The driver now batches all ATC_INV
commands via arm_smmu_invs_end_batch().
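To illustrate the concern, here is a rough userspace sketch of what a
per-SID retry would buy us. Everything in it is hypothetical (the
sketch_* names, the struct layout, and the simulated "timeout"); it is
not the driver code, only a model of the idea: issue the whole batch
first, and only on a timeout retry each command alone so that just the
unresponsive SIDs get marked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one ATC_INV command in a batch. */
struct sketch_atc_cmd {
	uint32_t sid;      /* StreamID the invalidation targets */
	bool ats_broken;   /* set once the device is identified as faulty */
};

/*
 * Hypothetical model of "issue commands, then sync". Returns false
 * (a timeout) if any command targets a SID listed in dead_sids. A real
 * driver would observe a sync timeout from the hardware instead.
 */
static bool sketch_issue_and_sync(const struct sketch_atc_cmd *cmds, size_t n,
				  const uint32_t *dead_sids, size_t n_dead)
{
	for (size_t i = 0; i < n; i++)
		for (size_t j = 0; j < n_dead; j++)
			if (cmds[i].sid == dead_sids[j])
				return false;
	return true;
}

/*
 * Issue the whole batch; on timeout, retry every command on its own and
 * mark only those that still time out. Returns how many were marked.
 */
static size_t sketch_find_broken(struct sketch_atc_cmd *cmds, size_t n,
				 const uint32_t *dead_sids, size_t n_dead)
{
	size_t broken = 0;

	assert(cmds != NULL || n == 0);

	if (sketch_issue_and_sync(cmds, n, dead_sids, n_dead))
		return 0; /* fast path: the whole batch completed */

	for (size_t i = 0; i < n; i++) {
		if (!sketch_issue_and_sync(&cmds[i], 1, dead_sids, n_dead)) {
			cmds[i].ats_broken = true;
			broken++;
		}
	}
	return broken;
}
```

Without the retry loop, the only information available is "the batch
timed out", so every SID in it would have to be treated as broken.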
> > 3. Driver calls pci_disable_ats() and clears STE.EATS.
> > 4. Driver marks domain->invs ATS entries as BROKEN.
> > (optional since pci_disable_ats() is done?)
>
> We need to stop sending invs otherwise there will be trouble making
> forward progress.
OK. This needs a surgical invs mutation: maybe the INV_TYPE_ATS_BROKEN
that you suggested.
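A minimal sketch of how I picture that mutation, assuming a flat array
of invalidation entries. The sketch_* types and helpers below are made
up for illustration and do not reflect the real domain->invs layout:
entries for the broken SID are retagged in place, and the issue path
skips anything no longer tagged as plain ATS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry types; only the ATS vs. ATS_BROKEN split matters. */
enum sketch_inv_type {
	SKETCH_INV_TYPE_TLB,
	SKETCH_INV_TYPE_ATS,
	SKETCH_INV_TYPE_ATS_BROKEN,
};

/* Hypothetical invalidation entry. */
struct sketch_inv {
	enum sketch_inv_type type;
	uint32_t sid;
};

/*
 * Retag every ATS entry that belongs to the given SID as broken.
 * Returns the number of entries that were retagged.
 */
static size_t sketch_mark_ats_broken(struct sketch_inv *invs, size_t n,
				     uint32_t sid)
{
	size_t marked = 0;

	assert(invs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (invs[i].type == SKETCH_INV_TYPE_ATS && invs[i].sid == sid) {
			invs[i].type = SKETCH_INV_TYPE_ATS_BROKEN;
			marked++;
		}
	}
	return marked;
}

/* Count the ATC invalidations that would still be issued. */
static size_t sketch_count_atc_to_issue(const struct sketch_inv *invs,
					size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (invs[i].type == SKETCH_INV_TYPE_ATS)
			count++;
	return count;
}
```

In this sketch, retagging rather than deleting keeps the array length
and ordering unchanged, so only the entry type has to be updated.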
> > 5. Driver sets master->ats_broken to fence concurrent attach:
> > arm_smmu_write_ste() and arm_smmu_ats_supported().
>
> Not sure this is needed, if we race some attach then the attach will
> re-set EATS, get another timeout and clear EATS. Doesn't seem worth
> trying to optimize for.
I didn't see that coming. master->ats_enabled && state->ats_enabled
in the commit() for a concurrent attachment would issue an ATC_INV
that may time out again and restart from step 1.
And since arm_smmu_atc_inv_master() doesn't use domain->invs, it is
not affected by INV_TYPE_ATS_BROKEN. So, ATC_INV can continue to be
issued in this case.
Ah, I feel that we are walking through a minefield where every single
step could be a kaboom. But your suggestion is clearly a safe path.
> > 6. Something external triggers an FLR (sysfs or AER).
> > 7. FLR goes through pci_dev_reset_iommu_prepare()/done(). done()
> > reverts 3+4 and calls the reset_device_done callback clearing
> > master->ats_broken (5).
>
> It should restore core/driver/hw synchronization of EATS and the
> pci_enable_ats() by installing a blocking domain. Then it can go on to
> re-attach a translating domain and everything is back to correct.
Yeah. We could probably drop master->ats_broken, as done() seems to
be sufficient. I'll do the rework first and see if any corner cases
come up.
> We do need to push a pci error event (didn't see that in this series)
> so the driver can catch it and start the FLR process. I suppose that
> will still need to bounce through a workqueue, and once you have that
> it can also set the blocked domain prior to calling out to the driver.
In the specific case that I am trying to tackle with this series, I
do see AER error prints from the device already but there is no FLR
process. So, I assume that, even if we push a PCI error event, that
wouldn't necessarily trigger an FLR?
Thanks
Nicolin