[PATCH 04/19] iommu/arm-smmu-v3: Make STE programming independent of the callers
Jason Gunthorpe
jgg at nvidia.com
Tue Jan 2 06:48:45 PST 2024
On Tue, Jan 02, 2024 at 04:13:28PM +0800, Michael Shavit wrote:
> On Tue, Dec 19, 2023 at 9:42 PM Michael Shavit <mshavit at google.com> wrote:
> ...
> > + if (hweight8(entry_qwords_used_diff) > 1) {
> > + /*
> > + * If transitioning to the target entry with a single qword
> > + * write isn't possible, then we must first transition to an
> > + * intermediate entry. The intermediate entry may either be an
> > + * entry that melds bits of the target entry into the current
> > + * entry without disrupting the hardware, or a breaking entry if
> > + * a hitless transition to the target is impossible.
> > + */
> > +
> > + /*
> > + * Compute a staging entry that has all the bits currently
> > + * unused by HW set to their target values, such that committing
> > + * it to the entry table wouldn't disrupt the hardware.
> > + */
> > + memcpy(staging_entry, cur, writer->entry_length);
> > + writer->ops.set_unused_bits(staging_entry, target);
> > +
> > + entry_qwords_used_diff =
> > + writer->ops.get_used_qword_diff_indexes(staging_entry,
> > + target);
> > + if (hweight8(entry_qwords_used_diff) > 1) {
> > + /*
> > + * More than 1 qword is mismatched between the staging
> > + * and target entry. A hitless transition to the target
> > + * entry is not possible. Set the staging entry to be
> > + * equal to the target entry, apart from the V bit's
> > + * qword. As long as the V bit is cleared first then
> > + * writes to the subsequent qwords will not further
> > + * disrupt the hardware.
> > + */
> > + memcpy(staging_entry, target, writer->entry_length);
> > + staging_entry[0] &= ~writer->v_bit;
> > + /*
> > + * After committing the staging entry, only the 0th qword
> > + * will differ from the target.
> > + */
> > + entry_qwords_used_diff = 1;
> > + }
> > +
> > + /*
> > + * Commit the staging entry. Note that the iteration order
> > + * matters, as we may be committing a breaking entry in the
> > + * non-hitless case. The 0th qword which holds the valid bit
> > + * must be written first in that case.
> > + */
> > + for (i = 0; i != writer->entry_length; i++)
> > + WRITE_ONCE(cur[i], staging_entry[i]);
> > + writer->ops.sync_entry(writer);
>
> Realized while replying to your latest email that this is wrong (and
> the unit-test as well!). It's not enough to just write the 0th qword
> first if it's a breaking entry; it must also sync after that 0th qword
> write.
Right.
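
For illustration only, here is a minimal userspace sketch of the ordering
described above for the breaking (non-hitless) case. It is not the driver
code: V_BIT, sync_entry() and write_breaking_entry() are hypothetical
stand-ins, plain volatile stores stand in for WRITE_ONCE(), and the final
"write qword 0 with V set" step is assumed from the comment that only the
0th qword differs after the staging entry is committed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define V_BIT     0x1ULL /* hypothetical: valid bit is bit 0 of qword 0 */
#define MAX_SYNCS 8

/* Instrumentation for the sketch: records whether V was set at each sync. */
static int sync_count;
static int v_set_at_sync[MAX_SYNCS];

/*
 * Stand-in for writer->ops.sync_entry(). A real driver would make the
 * hardware observe the entry here; this version only records state.
 */
static void sync_entry(const volatile uint64_t *cur)
{
	assert(sync_count < MAX_SYNCS);
	v_set_at_sync[sync_count++] = (cur[0] & V_BIT) != 0;
}

/* Hypothetical helper: move cur to target through a breaking entry. */
static void write_breaking_entry(volatile uint64_t *cur,
				 const uint64_t *target, size_t nr_qwords)
{
	size_t i;

	assert(nr_qwords >= 1);

	/*
	 * 1. Write qword 0 with V cleared and sync immediately, so the
	 *    hardware sees an invalid entry before any other qword changes.
	 */
	cur[0] = target[0] & ~V_BIT;
	sync_entry(cur);

	/* 2. With V clear, the remaining qwords can be written in any order. */
	for (i = 1; i < nr_qwords; i++)
		cur[i] = target[i];
	sync_entry(cur);

	/* 3. Assumed final step: write qword 0 with the target's V bit. */
	cur[0] = target[0];
	sync_entry(cur);
}
```

The point of the extra sync in step 1 is that writing qword 0 first is only
a program-order guarantee; without the sync, the hardware could still be
using the old qword 0 while the other qwords change underneath it.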
Jason