[bug report] iommu/arm-smmu-v3: Event cannot be printed in some scenarios

Tian, Kevin kevin.tian at intel.com
Thu Jul 25 17:04:23 PDT 2024


> From: Jason Gunthorpe <jgg at ziepe.ca>
> Sent: Thursday, July 25, 2024 8:59 PM
> 
> On Thu, Jul 25, 2024 at 07:35:00AM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg at ziepe.ca>
> > > Sent: Wednesday, July 24, 2024 9:03 PM
> > > + * and the fault remains owned by the caller. The caller should log the
> DMA
> > > + * protection failure and resolve the fault. Otherwise on success the fault
> is
> > > + * always completed eventually.
> >
> > About "resolve the fault", I didn't find such logic from smmu side in
> > arm_smmu_evtq_thread(). It just logs the event. Is it asking for new
> > change in smmu driver or reflecting the current fact which if missing
> > leads to the said stall problem?
> 
> It was removed in b554e396e51c ("iommu: Make iopf_group_response()
> return void")
> 
>         ret = iommu_report_device_fault(master->dev, &fault_evt);
> -       if (ret && flt->type == IOMMU_FAULT_PAGE_REQ) {
> -               /* Nobody cared, abort the access */
> -               struct iommu_page_response resp = {
> -                       .pasid          = flt->prm.pasid,
> -                       .grpid          = flt->prm.grpid,
> -                       .code           = IOMMU_PAGE_RESP_FAILURE,
> -               };
> -               arm_smmu_page_response(master->dev, &fault_evt, &resp);
> -       }
> -
> 
> Part of the observation going into b554e396e51c was that all drivers
> have something like the above, and we can pull it into the core code.
> 
> So perhaps we should still always abort the request from
> iommu_report_device_fault() instead of requiring boilerplate like
> above in drivers. That does some better.
> 
> The return code only indicates if the event should be logged.
> 

Yes, this makes more sense. Otherwise we need also pull back
those removed lines in drivers for fault resolving.


More information about the linux-arm-kernel mailing list