[PATCH 04/11] iommu/arm-smmu: Introduce automatic stream-id-masking
Will Deacon
will.deacon at arm.com
Wed Jan 22 10:26:22 EST 2014
Hi Andreas,
This patch always requires some extra brain cycles when reviewing!
On Thu, Jan 16, 2014 at 12:44:16PM +0000, Andreas Herrmann wrote:
> Try to determine a mask that can be used for all StreamIDs of a master
> device. This allows using just one SMR group instead of
> number-of-streamids SMR groups for a master device.
>
> Changelog:
You can put the change log and notes after the '---' so they don't appear in
the commit log, although the commit message could probably use a brief
description of your algorithm.
> * Sorting of stream IDs (to make usage of S2CR independent of sequence of
> stream IDs in DT)
> - intentionally not implemented
> - code does not rely on sorting
> - in fact sorting might make things worse with this simple
> implementation
> + Example: master with stream IDs 4, 5, 6, 0xe, 0xf requires 3
> SMRs when IDs are specified in this sorted order (one to map 4,
> 5, one to map 6, one to map 0xe, 0xf) but just 2 SMRs when
> specified as 4, 5, 0xe, 0xf, 6 (one to map 4, 5, 0xe, 0xf and
> one SMR to map 6)
> - thus by modifying the DT information you can affect the number of
> S2CRs required for stream matching
> => I'd say "use common sense" when specifying stream IDs for a master
> device in DT.
Then we probably want a comment in the driver helping people work out what
the best ordering is.
> @@ -1025,10 +1030,109 @@ static void arm_smmu_domain_destroy(struct iommu_domain *domain)
> kfree(smmu_domain);
> }
>
> +static int determine_smr_mask(struct arm_smmu_device *smmu,
> + struct arm_smmu_master *master,
> + struct arm_smmu_smr *smr, int start, int order)
> +{
> + u16 i, zero_bits_mask, one_bits_mask, const_mask;
> + int nr;
> +
> + nr = 1 << order;
> +
> + if (nr == 1) {
> + /* no mask, use streamid to match and be done with it */
> + smr->mask = 0;
> + smr->id = master->streamids[start];
> + return 0;
> + }
> +
> + zero_bits_mask = 0;
> + one_bits_mask = 0xffff;
> + for (i = start; i < start + nr; i++) {
> + zero_bits_mask |= master->streamids[i]; /* const 0 bits */
> + one_bits_mask &= master->streamids[i]; /* const 1 bits */
> + }
> + zero_bits_mask = ~zero_bits_mask;
> +
> + /* bits having constant values (either 0 or 1) */
> + const_mask = zero_bits_mask | one_bits_mask;
> +
> + i = hweight16(~const_mask);
> + if ((1 << i) == nr) {
> + smr->mask = ~const_mask;
> + smr->id = one_bits_mask;
This part always confuses me. Why do we check (1 << i) against nr? In fact,
in your example where we have SIDs {4,5,e,f,6}, then we'll call this
initially with start = 0, order = 2 and try to allocate an smr for
{4,5,e,f}. That will succeed with mask 1011b and id 0100b, but the mask has
a hamming weight of 3, so (1 << i) is 8, which is != nr (4).
Where am I getting this wrong?
I also still need to convince myself that we can't end up generating smrs
which match the same SID. Is that what your check above is trying to handle?
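For reference, the two questions above can be checked outside the kernel. What follows is a minimal userspace sketch, not part of the patch: the names sketch_hweight16(), sketch_smr_mask() and sketch_smr_overlap() are hypothetical, the 0 / -1 return values are an assumption (the quoted hunk is cut off before the function returns), and __builtin_popcount assumes GCC or Clang. It mirrors the quoted mask computation and adds an overlap test based on a set mask bit meaning "ignore this ID bit".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's hweight16(). */
static int sketch_hweight16(uint16_t v)
{
	return __builtin_popcount(v);
}

/*
 * Mirrors the quoted mask computation for nr stream IDs in sids[].
 * Returns 0 and fills *mask / *id when the varying bits span exactly
 * nr values, -1 otherwise (these return values are an assumption).
 */
static int sketch_smr_mask(const uint16_t *sids, int nr,
			   uint16_t *mask, uint16_t *id)
{
	uint16_t any_set = 0, all_set = 0xffff, varying;
	int i;

	assert(nr > 0);
	for (i = 0; i < nr; i++) {
		any_set |= sids[i];	/* bit is 1 in at least one ID */
		all_set &= sids[i];	/* bit is 1 in every ID */
	}

	/* bits that are neither constant 0 nor constant 1 */
	varying = any_set & ~all_set;

	if ((1 << sketch_hweight16(varying)) != nr)
		return -1;

	*mask = varying;
	*id = all_set;
	return 0;
}

/*
 * Two entries can match the same stream ID iff their ids agree on
 * every bit that neither entry masks out.
 */
static int sketch_smr_overlap(uint16_t id_a, uint16_t mask_a,
			      uint16_t id_b, uint16_t mask_b)
{
	return ((id_a ^ id_b) & (uint16_t)~(mask_a | mask_b)) == 0;
}
```

Under these assumptions, {4,5,e,f} has three varying bits, so a mask of 1011b would cover 8 IDs and the check rejects it for nr = 4, while {4,5} yields mask 0001b and id 0100b. The overlap helper also reports that a 1011b/0100b entry would match SID 6, which suggests the check exists to stop an entry from covering IDs outside the group: when the nr IDs are distinct and (1 << hweight) equals nr, the entry matches exactly those IDs.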
> +static int determine_smr_mapping(struct arm_smmu_device *smmu,
> + struct arm_smmu_master *master,
> + struct arm_smmu_smr *smrs, int max_smrs)
> +{
> + int nr_sid, nr, i, bit, start;
> +
> + /*
> + * This function is called only once -- when a master is added
> + * to a domain. If master->num_s2crs != 0 then this master
> + * was already added to a domain.
> + */
> + BUG_ON(master->num_s2crs);
I think I'd rather WARN and return -EINVAL. We needn't kill the kernel for
this.
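As an illustration of the WARN-and-return pattern, here is a userspace sketch. The struct, sketch_warn_on() and sketch_determine_smr_mapping() are hypothetical stand-ins, not the driver's code; in the kernel the check would normally be written as "if (WARN_ON(master->num_s2crs)) return -EINVAL;", since WARN_ON() evaluates to its condition.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical reduced stand-in for struct arm_smmu_master. */
struct sketch_master {
	int num_s2crs;
};

/* Hypothetical stand-in for WARN_ON(): report, then hand back the condition. */
static int sketch_warn_on(int cond, const char *what)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", what);
	return cond;
}

static int sketch_determine_smr_mapping(struct sketch_master *master)
{
	assert(master != NULL);

	/* The master was already added to a domain: complain and bail out. */
	if (sketch_warn_on(master->num_s2crs != 0, "master already has S2CRs"))
		return -EINVAL;

	/* ... the mask/id search would follow here ... */
	return 0;
}
```

The caller then sees an ordinary error code and can unwind, instead of the whole system going down.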
> +
> + start = nr = 0;
> + nr_sid = master->num_streamids;
> + do {
> + /*
> + * largest power-of-2 number of streamids for which to
> + * determine a usable mask/id pair for stream matching
> + */
> + bit = fls(nr_sid);
If you use __fls...
> + if (!bit)
> + return 0;
> +
> + /*
> + * iterate over power-of-2 numbers to determine
> + * largest possible mask/id pair for stream matching
> + * of next 2**i streamids
> + */
> + for (i = bit - 1; i >= 0; i--) {
... then you don't need this -1.
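To spell out the difference: fls() returns a 1-based bit position and 0 for an input of 0, whereas __fls() returns a 0-based position and is undefined for 0. A userspace sketch of both follows, with hypothetical names, assuming GCC or Clang builtins.

```c
#include <assert.h>

/* Like fls(): 1-based index of the most significant set bit, 0 if x == 0. */
static int sketch_fls(unsigned int x)
{
	return x ? 8 * (int)sizeof(x) - __builtin_clz(x) : 0;
}

/* Like __fls(): 0-based index of the most significant set bit; x must be non-zero. */
static unsigned long sketch_fls_index(unsigned long x)
{
	assert(x != 0);
	return 8 * sizeof(x) - 1 - __builtin_clzl(x);
}
```

Note that with __fls() the "if (!bit)" test can no longer detect an empty list, so nr_sid itself would have to be checked for zero before the call.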
> + if(!determine_smr_mask(smmu, master,
Cosmetic: space after 'if'.
> /* It worked! Now, poke the actual hardware */
> - for (i = 0; i < master->num_streamids; ++i) {
> + for (i = 0; i < master->num_s2crs; ++i) {
> u32 reg = SMR_VALID | smrs[i].id << SMR_ID_SHIFT |
> smrs[i].mask << SMR_MASK_SHIFT;
> + dev_dbg(smmu->dev, "SMR%d: 0x%x\n", smrs[i].idx, reg);
I think we can drop the dev_dbg statements from this patch.
Will