[RFC PATCH 04/30] iommu/arm-smmu-v3: Add support for PCI ATS
okaya at codeaurora.org
okaya at codeaurora.org
Thu Mar 2 05:11:03 PST 2017
On 2017-03-02 05:51, Jean-Philippe Brucker wrote:
> Hi Sinan,
>
> On 01/03/17 19:24, Sinan Kaya wrote:
>> On 2/27/2017 2:54 PM, Jean-Philippe Brucker wrote:
>>> /* Initialise command lazily */
>>> + if (!cmd.opcode)
>>> + arm_smmu_atc_invalidate_to_cmd(smmu, iova, size, &cmd);
>>> +
>>> + spin_lock(&smmu_group->devices_lock);
>>> +
>>> + list_for_each_entry(master, &smmu_group->devices, group_head)
>>> + arm_smmu_atc_invalidate_master(master, &cmd);
>>> +
>>> + /*
>>> + * TODO: ensure we do a sync whenever we have sent ats_queue_depth
>>> + * invalidations to the same device.
>>> + */
>>> + arm_smmu_cmdq_issue_cmd(smmu, &sync_cmd);
>>> +
>>
>> It is possible to observe ATS invalidation timeout up to 90 seconds
>> according to PCIe
>> spec. How does the current code deal with this?
>>
>
> Currently we give up waiting for sync after 100us
> (ARM_SMMU_POLL_TIMEOUT_US).
> A successful sync guarantees that all ATC invalidations since last sync
> were successful, so in case of timeout we should resend all of them.
> The delay itself isn't important at the moment, since we don't handle
> invalidaton failure at all. It's fire and forget, and I haven't come up
> with a complete solution yet.
>
> The simplest error handling would be to retry invalidation after 90
> seconds if the sync didn't complete. Then after a number of failed
> attempts, maybe try to reset the device. Given that ats_invalidate is
> generally called in irq-safe context, we would be blocking the CPU for
> minutes at a time, which seems unwise. A proper solution would be to
> postpone the unmap and return an error, although unmap errors are
> usually ignored.
>
> To avoid letting anyone remap something at that address until we're
> sure
> the invalidation succeeded, we would need to repopulate the page tables
> with the stale mapping, and add a delayed work that inspects the status
> of the invalidation and tries again if it failed. If the invalidation
> comes from the IOMMU core, we control the page tables and it should be
> doable. If it comes from mm/ however, it's more complicated. MMU
> notifiers only tell us that the mapping is going away, they don't
> provide us with a way to hold on to them. Until all stale mappings have
> been invalidated, we also need to hold on to the address space.
>
> I think the problem is complex enough to deserve a series of its own,
> once we confirm that it may happen in hardware and have a rough idea of
> the timeout values encountered.
Ok, fair enough. I think arm smmuv3 driver should follow the same design
pattern other iommu drivers are following to solve this issue as other
drivers are already handling this.
Based on what I see, if there is a timeout happenibg; your sync
operation will not complete (consumer! = producer) until timeout
finishes.
>
> Thanks,
> Jean-Philippe
More information about the linux-arm-kernel
mailing list