[PATCH 04/19] iommu/arm-smmu-v3: Make STE programming independent of the callers
Jason Gunthorpe
jgg at nvidia.com
Wed Oct 18 06:04:55 PDT 2023
On Wed, Oct 18, 2023 at 07:05:49PM +0800, Michael Shavit wrote:
> > FWIW, I found the most difficult part to be the used bit calculation, not
> > the update algorithm. Difficult because it is hard to read and find in
> > the spec when things are IGNORED, but it is a "straightforward" job of
> > finding IGNORED cases and making the used bits 0.
>
> The update algorithm is the part I'm finding much harder to read and
> review :) . arm_smmu_write_entry_step in particular is hard to read
> through; on top of which there are some subtle dependencies between
> loop iterations that weren't obvious to grok:
Yes, you have it right, it is basically a classic greedy
algorithm. Let's improve the comment.
> * Relying on the used_bits to be recomputed after the first iteration
> where V was set to 0, so that more bits can now be set.
> * The STE having to be synced between iterations to prevent broken STE
> reads by the SMMU (there's a comment somewhere else in arm-smmu-v3.c
> that would fit nicely here instead). But the caller is responsible for
> issuing this sync between iterations for some reason (supposedly to
> support CD entries as well in the next series)
Yes, for CD entry support.
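To make the caller contract concrete, the caller side would look roughly
like this (illustrative only: the step function's signature is simplified
here, and I'm assuming it returns true while more steps remain):

static void arm_smmu_write_ste(struct arm_smmu_device *smmu, u32 sid,
                               __le64 *ste, const __le64 *target)
{
        bool more;

        /*
         * Each pass writes one safe step into the live STE and then
         * issues CFGI_STE + SYNC so the SMMU never observes a torn
         * combination of two steps. A CD writer would use the same loop
         * with its own sync primitive.
         */
        do {
                more = arm_smmu_write_entry_step(ste, target);
                arm_smmu_sync_ste_for_sid(smmu, sid);
        } while (more);
}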
How about:
/*
* This algorithm updates any STE/CD to any value without creating a situation
* where the HW can perceive a corrupted entry. HW is only required to provide
* 64-bit atomicity for stores from the CPU, while entries are multiple 64-bit
* values in size.
*
* The algorithm works by evolving the entry toward the target in a series of
* steps. Each step synchronizes with the HW so that the HW cannot see an entry
* torn across two steps. Upon each call, cur/cur_used reflect the current
* synchronized value seen by the HW.
*
* During each step the HW can observe a torn entry that has any combination of
* the step's old/new 64-bit words. The algorithm's objective is that the HW
* behavior is always one of the current behavior, V=0, or the new behavior,
* during each step and across all steps.
*
* At each step one of three actions is chosen to evolve cur to target:
* - Update all unused bits with their target values.
* This relies on the IGNORED behavior described in the specification.
* - Update a single 64-bit value
* - Update all unused bits and set V=0
*
* The last two actions will cause cur_used to change, which will then allow the
* first action on the next step.
*
* In the most general case we can make any update in three steps:
* - Disrupting the entry (V=0)
* - Fill the now-unused bits, i.e. all bits except V
* - Make valid (V=1), a single 64-bit store
*
* However, this disrupts the HW while it is happening. There are several
* interesting cases where an STE/CD can be updated without disturbing the HW
* because only a small number of bits are changing (S1DSS, CONFIG, etc.) or
* because the used bits don't intersect. We can detect this by calculating how
* many 64-bit values need updating after adjusting the unused bits and skip
* the V=0 process.
*/
static bool arm_smmu_write_entry_step(__le64 *cur, const __le64 *cur_used,
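To make the three actions concrete, here is a toy user-space model of one
step (uint64_t instead of __le64, plain stores instead of WRITE_ONCE, bit 0
of qword 0 standing in for V; just the decision logic, not the patch code):

#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_QWORDS 8                    /* an STE is 8 64-bit words */
#define V_BIT ((uint64_t)1)             /* bit 0 of qword 0, like the STE */

/*
 * Returns true if another step (and another sync by the caller) is
 * required. The caller must recompute cur_used from the entry's current
 * contents before every call.
 */
static bool entry_step(uint64_t *cur, const uint64_t *cur_used,
                       const uint64_t *target, const uint64_t *target_used)
{
        uint64_t unused_update[NUM_QWORDS];
        unsigned int used_diff = 0, unused_diff = 0, i;

        /*
         * Candidate entry with every bit the HW currently ignores set to
         * its target value. Changing only those bits never changes what
         * the HW observes.
         */
        for (i = 0; i != NUM_QWORDS; i++) {
                unused_update[i] = (cur[i] & cur_used[i]) |
                                   (target[i] & ~cur_used[i]);
                if (unused_update[i] != cur[i])
                        unused_diff |= 1u << i;
                /* Qwords still wrong in bits the target configuration uses */
                if ((unused_update[i] ^ target[i]) & target_used[i])
                        used_diff |= 1u << i;
        }

        if (__builtin_popcount(used_diff) > 1) {
                /* Action 3: not possible hitlessly, fill unused bits, clear V */
                unused_update[0] &= ~V_BIT;
                memcpy(cur, unused_update, sizeof(unused_update));
                return true;    /* after the sync almost everything is unused */
        }
        if (unused_diff) {
                /* Action 1: only IGNORED bits change, write them all at once */
                memcpy(cur, unused_update, sizeof(unused_update));
                return used_diff || memcmp(cur, target, sizeof(unused_update));
        }
        if (used_diff) {
                /* Action 2: one atomic 64-bit store flips to the new behavior */
                i = __builtin_ffs((int)used_diff) - 1;
                cur[i] = target[i];
                return memcmp(cur, target, sizeof(unused_update)) != 0;
        }
        return false;           /* already equal to the target */
}

The caller drives this until it returns false, syncing after every call and
recomputing cur_used from the entry's current contents; that recomputation is
what lets the second and third actions unlock the first one on the next step.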
Jason