[PATCH] iommu/arm-smmu-v3: Make STE programming independent of the callers
Jason Gunthorpe
jgg at nvidia.com
Wed Jan 10 05:10:32 PST 2024
On Sat, Jan 06, 2024 at 04:36:14PM +0800, Michael Shavit wrote:
> +/*
> + * Update the STE/CD to the target configuration. The transition from the current
> + * entry to the target entry takes place over multiple steps that attempts to make
> + * the transition hitless if possible. This function takes care not to create a
> + * situation where the HW can perceive a corrupted entry. HW is only required to
> + * have a 64 bit atomicity with stores from the CPU, while entries are many 64
> + * bit values big.
> + *
> + * The algorithm works by evolving the entry toward the target in a series of
> + * steps. Each step synchronizes with the HW so that the HW can not see an entry
> + * torn across two steps. During each step the HW can observe a torn entry that
> + * has any combination of the step's old/new 64 bit words. The algorithm
> + * objective is for the HW behavior to always be one of current behavior, V=0,
> + * or new behavior.
> + *
> + * In the most general case we can make any update in three steps:
> + * - Disrupting the entry (V=0)
> + * - Fill now unused bits, all bits except V
> + * - Make valid (V=1), single 64 bit store
> + *
> + * However this disrupts the HW while it is happening. There are several
> + * interesting cases where a STE/CD can be updated without disturbing the HW
> + * because only a small number of bits are changing (S1DSS, CONFIG, etc) or
> + * because the used bits don't intersect. We can detect this by calculating how
> + * many 64 bit values need update after adjusting the unused bits and skip the
> + * V=0 process. This relies on the IGNORED behavior described in the
> + * specification
> + */
I edited this a bit more:
/*
* Update the STE/CD to the target configuration. The transition from the
* current entry to the target entry takes place over multiple steps that
* attempt to make the transition hitless if possible. This function takes care
* not to create a situation where the HW can perceive a corrupted entry. HW is
* only required to have 64 bit atomicity with stores from the CPU, while
* entries are many 64 bit values big.
*
* The difference between the current value and the target value is analyzed to
* determine which of three kinds of update is required: disruptive, hitless or
* no change.
*
* In the most general disruptive case we can make any update in three steps:
* - Disrupting the entry (V=0)
* - Fill now unused qwords, except qword 0 which contains V
* - Make qword 0 have the final value and valid (V=1) with a single 64
* bit store
*
* However this disrupts the HW while it is happening. There are several
* interesting cases where a STE/CD can be updated without disturbing the HW
* because only a small number of bits are changing (S1DSS, CONFIG, etc) or
* because the used bits don't intersect. We can detect this by calculating how
* many 64 bit values need to be updated after adjusting the unused bits, and
* skip the V=0 process. This relies on the IGNORED behavior described in the
* specification.
*/
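To make the three-way split in that comment concrete, here is a minimal userspace sketch of the classification step. It is not the patch's code: it uses host-endian uint64_t instead of __le64, takes the "used" masks as plain inputs rather than deriving them through a callback, and the names qword_diff, classify_update and enum update_kind are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_QWORDS 8 /* an STE or CD is eight qwords; used only as a bound */

/* Hypothetical names, chosen for this sketch only. */
enum update_kind { UPDATE_NONE, UPDATE_HITLESS, UPDATE_DISRUPTIVE };

/*
 * Returns a bitmap with bit i set when qword i still has used bits that
 * differ from the target after every currently-unused bit has been moved
 * to its target value. unused_update[] receives that intermediate entry.
 */
static unsigned int qword_diff(const uint64_t *cur, const uint64_t *cur_used,
			       const uint64_t *tgt, const uint64_t *tgt_used,
			       uint64_t *unused_update, unsigned int n)
{
	unsigned int diff = 0;
	unsigned int i;

	assert(n <= MAX_QWORDS);
	for (i = 0; i < n; i++) {
		/* Bits the HW currently ignores may change freely. */
		unused_update[i] = (cur[i] & cur_used[i]) |
				   (tgt[i] & ~cur_used[i]);
		/* Anything still wrong needs a write the HW can observe. */
		if ((unused_update[i] & tgt_used[i]) != tgt[i])
			diff |= 1u << i;
	}
	return diff;
}

static enum update_kind classify_update(unsigned int diff)
{
	unsigned int bits = 0;

	/* Count set bits; more than one means a 64 bit store is not enough. */
	while (diff) {
		diff &= diff - 1;
		bits++;
	}
	if (bits == 0)
		return UPDATE_NONE;
	return bits == 1 ? UPDATE_HITLESS : UPDATE_DISRUPTIVE;
}
```

With one differing qword the whole change can ride on a single 64 bit store, which is what makes the hitless path possible; two or more force the V=0 sequence.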
> +void arm_smmu_write_entry(const struct arm_smmu_entry_writer_ops *ops,
> + __le64 *entry, const __le64 *target)
> +{
> + __le64 unused_update[NUM_ENTRY_QWORDS];
> + u8 used_qword_diff;
> + unsigned int critical_qword_index;
> +
> + used_qword_diff = compute_qword_diff(ops, entry, target, unused_update);
> + if (hweight8(used_qword_diff) > 1) {
> + /*
> + * At least two qwords need their used bits to be changed. This
> + * requires a breaking update, zero the V bit, write all qwords
> + * but 0, then set qword 0
> + */
> + unused_update[0] = entry[0] & (~ops->v_bit);
> + entry_set(ops, entry, unused_update, 0, 1);
> + entry_set(ops, entry, target, 1, ops->num_entry_qwords - 1);
> + entry_set(ops, entry, target, 0, 1);
> + } else if (hweight8(used_qword_diff) == 1) {
> + /*
> + * Only one qword needs its used bits to be changed. This is a
> + * hitless update, update all bits the current STE is ignoring
> + * to their new values, then update a single qword to change the
> + * STE and finally 0 out any bits that are now unused in the
> + * target configuration.
> + */
> + critical_qword_index = ffs(used_qword_diff) - 1;
> + /*
> + * Skip writing unused bits in the critical qword since we'll be
> + * writing it in the next step anyways. This can save a sync
> + * when the only change is in that qword.
> + */
> + unused_update[critical_qword_index] = entry[critical_qword_index];
Oh that is a neat improvement!
> + entry_set(ops, entry, unused_update, 0, ops->num_entry_qwords);
> + entry_set(ops, entry, target, critical_qword_index, 1);
> + entry_set(ops, entry, target, 0, ops->num_entry_qwords);
> + } else {
> + /*
> + * If everything is working properly this shouldn't do anything
> + * as unused bits should always be 0 and thus can't change.
> + */
> + WARN_ON_ONCE(entry_set(ops, entry, target, 0,
> + ops->num_entry_qwords));
> + }
> +}
> +
> +#undef NUM_ENTRY_QWORDS
It is fine to keep the constant, it is reasonably named.
> +struct arm_smmu_ste_writer {
> + struct arm_smmu_entry_writer_ops ops;
> + struct arm_smmu_device *smmu;
> + u32 sid;
> +};
I think the security-focused people will not be totally happy with writable
function pointers.
So I changed it into:
struct arm_smmu_entry_writer_ops;
struct arm_smmu_entry_writer {
const struct arm_smmu_entry_writer_ops *ops;
struct arm_smmu_master *master;
};
struct arm_smmu_entry_writer_ops {
unsigned int num_entry_qwords;
__le64 v_bit;
void (*get_used)(struct arm_smmu_entry_writer *writer, const __le64 *entry,
__le64 *used);
void (*sync)(struct arm_smmu_entry_writer *writer);
};
(both the STE and the CD writers can use the master)
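
As a rough illustration of why this shape helps, the sketch below keeps the function pointers in a const table and puts only data pointers in the per-call writer. Everything prefixed demo_ is a stand-in invented for this example (demo_master replaces struct arm_smmu_master, and the sync callback just counts calls); the plain stores stand in for whatever ordered writes the real driver uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for struct arm_smmu_master; only counts syncs in this sketch. */
struct demo_master {
	unsigned int sync_count;
};

struct demo_writer;

struct demo_writer_ops {
	unsigned int num_entry_qwords;
	uint64_t v_bit;
	void (*sync)(struct demo_writer *writer);
};

/* The writer itself holds no function pointers, only pointers to data. */
struct demo_writer {
	const struct demo_writer_ops *ops;
	struct demo_master *master;
};

static void demo_sync(struct demo_writer *writer)
{
	writer->master->sync_count++;
}

/* const, so the compiler can place the function pointer in read-only data. */
static const struct demo_writer_ops demo_ste_ops = {
	.num_entry_qwords = 8,
	.v_bit = 1,
	.sync = demo_sync,
};

/* Copy len qwords starting at start; sync only if something changed. */
static bool demo_entry_set(struct demo_writer *writer, uint64_t *entry,
			   const uint64_t *target, unsigned int start,
			   unsigned int len)
{
	bool changed = false;
	unsigned int i;

	assert(start + len <= writer->ops->num_entry_qwords);
	for (i = start; i < start + len; i++) {
		if (entry[i] != target[i]) {
			entry[i] = target[i];
			changed = true;
		}
	}
	if (changed)
		writer->ops->sync(writer);
	return changed;
}
```

A caller would build a struct demo_writer on the stack with .ops = &demo_ste_ops and its own master, then issue the same sequence of entry_set style calls as in the quoted function.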
Jason