[PATCH v2 7/7] iommu/arm-smmu-v3: Block ATS upon an ATC invalidation timeout
Shuai Xue
xueshuai at linux.alibaba.com
Thu Mar 19 00:41:49 PDT 2026
On 3/19/26 11:26 AM, Nicolin Chen wrote:
> On Thu, Mar 19, 2026 at 10:56:43AM +0800, Shuai Xue wrote:
>> On 3/18/26 3:15 AM, Nicolin Chen wrote:
>>> For batched ATC_INV commands, SMMU hardware only reports a timeout at the
>>> CMD_SYNC, which could follow the batch issued for multiple devices. So, it
>>> isn't straightforward to identify which command in a batch resulted in the
>>> timeout. Fortunately, the invs array has a sorted list of ATC entries. So,
>>> the issued batch must be sorted as well. This makes it possible to bisect
>>> the batch to retry the command per Stream ID and identify the master.
>>
>> Nit: The implementation is a linear per-SID retry, not a binary
>> search / bisection. Suggest rewording to:
>>
>> "retry the ATC_INV command for each unique Stream ID in the batch
>> to identify the unresponsive master"
>
> You are right. And that sounds OK.
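
For illustration, the per-SID retry could be sketched in userspace C as below. This is not the driver code: issue_atc_inv() is a hypothetical stand-in for reissuing ATC_INV plus CMD_SYNC for one Stream ID, and here it simply simulates SID 7 being unresponsive.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for reissuing ATC_INV + CMD_SYNC for one SID;
 * returns false on a simulated timeout (SID 7 here, for the example). */
static bool issue_atc_inv(uint32_t sid)
{
	return sid != 7;
}

/* Walk the sorted SID list, skipping duplicates, and return the first
 * Stream ID whose retry times out (UINT32_MAX if every retry succeeds). */
static uint32_t find_faulty_sid(const uint32_t *sids, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (i && sids[i] == sids[i - 1])
			continue; /* batch is sorted: duplicates are adjacent */
		if (!issue_atc_inv(sids[i]))
			return sids[i];
	}
	return UINT32_MAX;
}
```

Since the invs array keeps the batch sorted, duplicate SIDs are adjacent, so one linear pass with a duplicate-skip is enough to retry each unique Stream ID exactly once.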
>
>>> +	step = arm_smmu_get_step_for_sid(smmu, sid);
>>> +	WRITE_ONCE(step->data[1],
>>> +		   READ_ONCE(step->data[1]) & cpu_to_le64(~STRTAB_STE_1_EATS));
>>
>>
>> This non-atomic read-modify-write on step->data[1] can race with the
>> normal STE installation path (arm_smmu_write_entry → entry_set →
>> WRITE_ONCE).
>>
>> The error path runs from:
>>
>> __arm_smmu_domain_inv_range()          (data path, no group->mutex)
>>   → arm_smmu_cmdq_batch_retry()
>>     → arm_smmu_master_disable_ats()
>>       → arm_smmu_disable_eats_for_sid()  ← NO locks on STE
>>
>> The normal STE path runs from:
>>
>> iommu_attach_device()
>>   → mutex_lock(&group->mutex)
>>     → arm_smmu_attach_dev()
>>       → mutex_lock(&arm_smmu_asid_lock)
>>         → arm_smmu_install_ste_for_dev()
>>           → arm_smmu_write_entry()       ← holds both mutexes
>>
>> Since the error path holds neither group->mutex nor arm_smmu_asid_lock,
>> the following race is possible:
>
> Because invalidations can run in atomic context, we can't take
> those mutex locks there.
>
>>   CPU A (error path):          CPU B (attach path):
>>   READ data[1] = X
>>                                WRITE data[1] = Y (new STE config)
>>   WRITE data[1] = X & ~EATS
>>   // Y is lost
>>
>> This could clobber a concurrent STE update from the attach path.
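
To make the lost update concrete, here is a deterministic userspace replay of that interleaving (EATS_MASK stands in for STRTAB_STE_1_EATS; the bit position and values are arbitrary, for illustration only):

```c
#include <stdint.h>

#define EATS_MASK (3ULL << 28) /* stand-in for STRTAB_STE_1_EATS */

/* Deterministic replay of the interleaving above with a plain (non-atomic)
 * read-modify-write: CPU B's new STE word Y is silently overwritten by
 * CPU A's write, which was computed from the stale value X. */
static uint64_t replay_lost_update(uint64_t x, uint64_t y)
{
	uint64_t data1 = x;
	uint64_t cpu_a_read = data1;        /* CPU A: READ data[1] = X  */

	data1 = y;                          /* CPU B: WRITE data[1] = Y */
	data1 = cpu_a_read & ~EATS_MASK;    /* CPU A: WRITE X & ~EATS   */
	return data1;                       /* Y is lost                */
}
```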
>
> Oh, that's true. Maybe this:
> 	__le64 new, old = READ_ONCE(step->data[1]);
> 	[...]
> 	do {
> 		new = old & cpu_to_le64(~STRTAB_STE_1_EATS);
> 	} while (!try_cmpxchg64(&step->data[1], &old, new));
> ?
Yes, the cmpxchg loop looks correct to me.
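
For reference, a self-contained userspace analogue of that loop, using C11 atomics in place of try_cmpxchg64() (the EATS field position is assumed here, for illustration only):

```c
#include <stdatomic.h>
#include <stdint.h>

#define STE_1_EATS (3ULL << 28) /* assumed field position, illustration only */

/* Clear the EATS field with a compare-and-exchange loop, so a concurrent
 * writer's update is never overwritten with a value computed from a stale
 * read. On CAS failure, 'old' is reloaded with the current word and 'new'
 * is recomputed, mirroring try_cmpxchg64() semantics. */
static void clear_eats(_Atomic uint64_t *word)
{
	uint64_t old = atomic_load(word);
	uint64_t new;

	do {
		new = old & ~STE_1_EATS;
	} while (!atomic_compare_exchange_weak(word, &old, new));
}
```

Unlike the plain READ_ONCE/WRITE_ONCE pair, the loop only commits its write if the word still holds the value it read, so a racing STE update makes it retry rather than clobber.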
Thanks.
Shuai