[PATCH] iommu/arm-smmu-v3: Make STE programming independent of the callers

Jason Gunthorpe jgg at nvidia.com
Wed Jan 10 05:10:32 PST 2024


On Sat, Jan 06, 2024 at 04:36:14PM +0800, Michael Shavit wrote:
> +/*
> + * Update the STE/CD to the target configuration. The transition from the current
> + * entry to the target entry takes place over multiple steps that attempts to make
> + * the transition hitless if possible. This function takes care not to create a
> + * situation where the HW can perceive a corrupted entry. HW is only required to
> + * have a 64 bit atomicity with stores from the CPU, while entries are many 64
> + * bit values big.
> + *
> + * The algorithm works by evolving the entry toward the target in a series of
> + * steps. Each step synchronizes with the HW so that the HW can not see an entry
> + * torn across two steps. During each step the HW can observe a torn entry that
> + * has any combination of the step's old/new 64 bit words. The algorithm
> + * objective is for the HW behavior to always be one of current behavior, V=0,
> + * or new behavior.
> + *
> + * In the most general case we can make any update in three steps:
> + *  - Disrupting the entry (V=0)
> + *  - Fill now unused bits, all bits except V
> + *  - Make valid (V=1), single 64 bit store
> + *
> + * However this disrupts the HW while it is happening. There are several
> + * interesting cases where a STE/CD can be updated without disturbing the HW
> + * because only a small number of bits are changing (S1DSS, CONFIG, etc) or
> + * because the used bits don't intersect. We can detect this by calculating how
> + * many 64 bit values need update after adjusting the unused bits and skip the
> + * V=0 process. This relies on the IGNORED behavior described in the
> + * specification
> + */

I edited this a bit more:


/*
 * Update the STE/CD to the target configuration. The transition from the
 * current entry to the target entry takes place over multiple steps that
 * attempt to make the transition hitless if possible. This function takes
 * care not to create a situation where the HW can perceive a corrupted
 * entry. HW is only required to have 64 bit atomicity with stores from the
 * CPU, while entries are many 64 bit values big.
 *
 * The difference between the current value and the target value is analyzed
 * to determine which of three update types is required - disruptive, hitless
 * or no change.
 *
 * In the most general disruptive case we can make any update in three steps:
 *  - Disrupt the entry (V=0)
 *  - Fill the now unused qwords, except qword 0 which contains V
 *  - Make qword 0 have the final value and be valid (V=1) with a single 64
 *    bit store
 *
 * However this disrupts the HW while it is happening. There are several
 * interesting cases where a STE/CD can be updated without disturbing the HW
 * because only a small number of bits are changing (S1DSS, CONFIG, etc) or
 * because the used bits don't intersect. We can detect this by calculating
 * how many 64 bit values need updating after adjusting the unused bits and
 * skip the V=0 process. This relies on the IGNORED behavior described in the
 * specification.
 */
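For anyone reading along, compute_qword_diff() is not quoted in this hunk.
Based on how its outputs are consumed below, a minimal sketch could look like
the following - the body and the get_used() signature are my reconstruction,
not the patch itself:

static u8 compute_qword_diff(const struct arm_smmu_entry_writer_ops *ops,
			     const __le64 *entry, const __le64 *target,
			     __le64 *unused_update)
{
	__le64 target_used[NUM_ENTRY_QWORDS] = {};
	__le64 cur_used[NUM_ENTRY_QWORDS] = {};
	u8 used_qword_diff = 0;
	unsigned int i;

	/* Ask the STE/CD specific code which bits the HW currently reads */
	ops->get_used(entry, cur_used);
	ops->get_used(target, target_used);

	for (i = 0; i != ops->num_entry_qwords; i++) {
		/*
		 * Bits the current configuration ignores can already be set
		 * to their target value without the HW noticing.
		 */
		unused_update[i] = (entry[i] & cur_used[i]) |
				   (target[i] & ~cur_used[i]);

		/*
		 * Assuming the make functions never set bits outside the used
		 * mask, record the qwords whose used bits still differ from
		 * the target after the unused update is applied.
		 */
		if ((unused_update[i] & target_used[i]) != target[i])
			used_qword_diff |= 1 << i;
	}
	return used_qword_diff;
}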

> +void arm_smmu_write_entry(const struct arm_smmu_entry_writer_ops *ops,
> +			  __le64 *entry, const __le64 *target)
> +{
> +	__le64 unused_update[NUM_ENTRY_QWORDS];
> +	u8 used_qword_diff;
> +	unsigned int critical_qword_index;
> +
> +	used_qword_diff = compute_qword_diff(ops, entry, target, unused_update);
> +	if (hweight8(used_qword_diff) > 1) {
> +		/*
> +		 * At least two qwords need their used bits to be changed. This
> +		 * requires a breaking update, zero the V bit, write all qwords
> +		 * but 0, then set qword 0
> +		 */
> +		unused_update[0] = entry[0] & (~ops->v_bit);
> +		entry_set(ops, entry, unused_update, 0, 1);
> +		entry_set(ops, entry, target, 1, ops->num_entry_qwords - 1);
> +		entry_set(ops, entry, target, 0, 1);
> +	} else if (hweight8(used_qword_diff) == 1) {
> +		/*
> +		 * Only one qword needs its used bits to be changed. This is a
> +		 * hitless update, update all bits the current STE is ignoring
> +		 * to their new values, then update a single qword to change the
> +		 * STE and finally 0 out any bits that are now unused in the
> +		 * target configuration.
> +		 */
> +		critical_qword_index = ffs(used_qword_diff) - 1;
> +		/*
> +		 * Skip writing unused bits in the critical qword since we'll be
> +		 * writing it in the next step anyways. This can save a sync
> +		 * when the only change is in that qword.
> +		 */
> +		unused_update[critical_qword_index] = entry[critical_qword_index];

Oh that is a neat improvement!

> +		entry_set(ops, entry, unused_update, 0, ops->num_entry_qwords);
> +		entry_set(ops, entry, target, critical_qword_index, 1);
> +		entry_set(ops, entry, target, 0, ops->num_entry_qwords);
> +	} else {
> +		/*
> +		 * If everything is working properly this shouldn't do anything
> +		 * as unused bits should always be 0 and thus can't change.
> +		 */
> +		WARN_ON_ONCE(entry_set(ops, entry, target, 0,
> +				       ops->num_entry_qwords));
> +	}
> +}
> +
> +#undef NUM_ENTRY_QWORDS

It is fine to keep the constant; it is reasonably named.
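entry_set() is also not visible in this hunk. From the call sites above (and
the WARN_ON_ONCE() on its return value in the no-change branch), it
presumably writes a range of qwords, syncs the HW if anything changed, and
reports whether it changed anything. A sketch, again my reconstruction rather
than the patch:

static bool entry_set(const struct arm_smmu_entry_writer_ops *ops,
		      __le64 *entry, const __le64 *target,
		      unsigned int start, unsigned int len)
{
	bool changed = false;
	unsigned int i;

	/* Copy the requested qword range into the live entry */
	for (i = start; i != start + len; i++) {
		if (entry[i] != target[i]) {
			WRITE_ONCE(entry[i], target[i]);
			changed = true;
		}
	}

	/* The HW only needs to be told about it if something changed */
	if (changed)
		ops->sync(ops);
	return changed;
}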

> +struct arm_smmu_ste_writer {
> +	struct arm_smmu_entry_writer_ops ops;
> +	struct arm_smmu_device *smmu;
> +	u32 sid;
> +};

I think the security-focused people will not be totally happy with writable
function pointers..

So I changed it into:

struct arm_smmu_entry_writer_ops;
struct arm_smmu_entry_writer {
	const struct arm_smmu_entry_writer_ops *ops;
	struct arm_smmu_master *master;
};

struct arm_smmu_entry_writer_ops {
	unsigned int num_entry_qwords;
	__le64 v_bit;
	void (*get_used)(struct arm_smmu_entry_writer *writer, const __le64 *entry,
			 __le64 *used);
	void (*sync)(struct arm_smmu_entry_writer *writer);
};

(both the STE and CD writers can use the master)
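To make that concrete, the STE side could then embed the writer and recover
its private data in the sync callback, roughly like this (illustrative only;
the helper names are placeholders, not necessarily what the next revision
will use):

struct arm_smmu_ste_writer {
	struct arm_smmu_entry_writer writer;
	u32 sid;
};

static void arm_smmu_ste_writer_sync(struct arm_smmu_entry_writer *writer)
{
	struct arm_smmu_ste_writer *ste_writer =
		container_of(writer, struct arm_smmu_ste_writer, writer);

	/* Placeholder for the existing STE sync-by-sid path */
	arm_smmu_sync_ste_for_sid(writer->master->smmu, ste_writer->sid);
}

static const struct arm_smmu_entry_writer_ops arm_smmu_ste_writer_ops = {
	.num_entry_qwords = NUM_ENTRY_QWORDS,
	.v_bit = cpu_to_le64(STRTAB_STE_0_V),
	.get_used = arm_smmu_get_ste_used,	/* STE-specific used-bits callback */
	.sync = arm_smmu_ste_writer_sync,
};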

Jason


