[PATCH v5 01/17] iommu/arm-smmu-v3: Make STE programming independent of the callers
Robin Murphy
robin.murphy at arm.com
Thu Feb 15 10:42:37 PST 2024
On 15/02/2024 4:01 pm, Jason Gunthorpe wrote:
> On Thu, Feb 15, 2024 at 01:49:53PM +0000, Will Deacon wrote:
>> Hi Jason,
>>
>> On Tue, Feb 06, 2024 at 11:12:38AM -0400, Jason Gunthorpe wrote:
>>> As the comment in arm_smmu_write_strtab_ent() explains, this routine has
>>> been limited to only work correctly in certain scenarios that the caller
>>> must ensure. Generally the caller must put the STE into ABORT or BYPASS
>>> before attempting to program it to something else.
>>
>> This is looking pretty good now, but I have a few comments inline.
>
> Ok
>
>>> @@ -48,6 +48,21 @@ enum arm_smmu_msi_index {
>>> ARM_SMMU_MAX_MSIS,
>>> };
>>>
>>> +struct arm_smmu_entry_writer_ops;
>>> +struct arm_smmu_entry_writer {
>>> + const struct arm_smmu_entry_writer_ops *ops;
>>> + struct arm_smmu_master *master;
>>> +};
>>> +
>>> +struct arm_smmu_entry_writer_ops {
>>> + unsigned int num_entry_qwords;
>>> + __le64 v_bit;
>>> + void (*get_used)(const __le64 *entry, __le64 *used);
>>> + void (*sync)(struct arm_smmu_entry_writer *writer);
>>> +};
>>
>> Can we avoid the indirection for now, please? I'm sure we'll want it later
>> when you extend this to CDs, but for the initial support it just makes it
>> more difficult to follow the flow. Should be a trivial thing to drop, I
>> hope.
>
> We can.
Ack, the abstraction is really hard to follow, and much of that seems
entirely self-inflicted in the amount of recalculating information which
was in-context in a previous step but then thrown away. And as best I
can tell I think it will still end up doing more CFGIs than needed.
Keeping a single monolithic check-and-update function will be *so* much
easier to understand and maintain. As far as CDs go, anything we might
reasonably want to change in a live CD is all in the first word so I
don't see any value in attempting to generalise further on that side of
things. Maybe arm_smmu_write_ctx_desc() could stand to be a bit
prettier, but honestly I don't think it's too bad as-is.
>>> +static void arm_smmu_get_ste_used(const __le64 *ent, __le64 *used_bits)
>>> {
>>> + unsigned int cfg = FIELD_GET(STRTAB_STE_0_CFG, le64_to_cpu(ent[0]));
>>> +
>>> + used_bits[0] = cpu_to_le64(STRTAB_STE_0_V);
>>> + if (!(ent[0] & cpu_to_le64(STRTAB_STE_0_V)))
>>> + return;
>>> +
>>> + /*
>>> + * See 13.5 Summary of attribute/permission configuration fields for the
>>> + * SHCFG behavior. It is only used for BYPASS, including S1DSS BYPASS,
>>> + * and S2 only.
>>> + */
>>> + if (cfg == STRTAB_STE_0_CFG_BYPASS ||
>>> + cfg == STRTAB_STE_0_CFG_S2_TRANS ||
>>> + (cfg == STRTAB_STE_0_CFG_S1_TRANS &&
>>> + FIELD_GET(STRTAB_STE_1_S1DSS, le64_to_cpu(ent[1])) ==
>>> + STRTAB_STE_1_S1DSS_BYPASS))
>>> + used_bits[1] |= cpu_to_le64(STRTAB_STE_1_SHCFG);
>>
>> Huh, SHCFG is really getting in the way here, isn't it?
>
> I wouldn't say that.. It is just a complicated bit of the spec. One of
> the things we recently did was to audit all the cache settings and, at
> least, we then realized that SHCFG was being subtly used by S2 as
> well..
Yeah, that really shouldn't be subtle; incoming attributes are replaced
by S1 translation, thus they are relevant to not-S1 configs.
I think it's likely to be significantly more straightforward to give up
on the switch statement and jump straight into the more architectural
paradigm at this level, e.g.
// Stage 1
if (cfg & BIT(0)) {
...
} else {
...
}
// Stage 2
if (cfg & BIT(1)) {
...
} else {
...
}
Thanks,
Robin.
More information about the linux-arm-kernel
mailing list