[PATCH 7/8] swiotlb: respect min_align_mask
Christoph Hellwig
hch at lst.de
Fri Feb 5 05:34:17 EST 2021
On Thu, Feb 04, 2021 at 11:13:45PM +0000, Robin Murphy wrote:
>> + */
>> +static unsigned int swiotlb_align_offset(struct device *dev, u64 addr)
>> +{
>> + unsigned min_align_mask = dma_get_min_align_mask(dev);
>> +
>> + if (!min_align_mask)
>> + return 0;
>
> I doubt that's beneficial - even if the compiler can convert it into a
> csel, it'll then be doing unnecessary work to throw away a
> cheaply-calculated 0 in favour of hard-coded 0 in the one case it matters
> ;)
True, I'll drop the checks.
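
For reference, a minimal userspace sketch of the helper with the early
return dropped. The IO_TLB_SHIFT value of 11, the align_offset name and
the self-check function are assumptions for illustration; the mask is
passed in directly rather than read through dma_get_min_align_mask(dev).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration: 2 KiB swiotlb slots. */
#define IO_TLB_SHIFT	11
#define IO_TLB_SIZE	(1u << IO_TLB_SHIFT)

/*
 * Userspace stand-in for swiotlb_align_offset(): the mask is a plain
 * argument instead of coming from the device.
 */
static unsigned int align_offset(unsigned int min_align_mask, uint64_t addr)
{
	return addr & min_align_mask & (IO_TLB_SIZE - 1);
}

/* Hypothetical self-check: a zero mask yields 0 without a special case. */
static void align_offset_selfcheck(void)
{
	assert(align_offset(0, 0x12345) == 0);
	assert(align_offset(0xfff, 0x12c45) == 0x445);
}
```

ANDing with a zero mask already produces 0, so the explicit check adds
a branch without changing the result.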
>
>> + return addr & min_align_mask & ((1 << IO_TLB_SHIFT) - 1);
>
> (BTW, for readability throughout, "#define IO_TLB_SIZE (1 << IO_TLB_SHIFT)"
> sure wouldn't go amiss...)
I actually had a patch doing just that, but as it is the only patch
touching swiotlb.h it caused endless rebuilds for me, so I dropped it
as it only had a few uses anyway. But I've added it back.
>> - if (alloc_size >= PAGE_SIZE)
>> + if (min_align_mask)
>> + stride = (min_align_mask + 1) >> IO_TLB_SHIFT;
>
> So this can't underflow because "min_align_mask" is actually just the
> high-order bits representing the number of iotlb slots needed to meet the
> requirement, right? (It took a good 5 minutes to realise this wasn't doing
> what I initially thought it did...)
Yes.
> In that case, a) could the local var be called something like
> iotlb_align_mask to clarify that it's *not* just a copy of the device's
> min_align_mask,
Ok.
> and b) maybe just have an unconditional initialisation that
> works either way:
>
> stride = (min_align_mask >> IO_TLB_SHIFT) + 1;
Sure.
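
A rough sketch of the unconditional form, assuming IO_TLB_SHIFT is 11
and that the local mask keeps only the bits at or above the slot size.
The function names here are made up for the example and are not taken
from the patch.

```c
#include <assert.h>

/* Assumed value for illustration: 2 KiB swiotlb slots. */
#define IO_TLB_SHIFT	11
#define IO_TLB_SIZE	(1u << IO_TLB_SHIFT)

/* Keep only the mask bits that span whole slots (assumed derivation). */
static unsigned int iotlb_align_mask(unsigned int min_align_mask)
{
	return min_align_mask & ~(IO_TLB_SIZE - 1);
}

/* Number of slots to step by; never 0, even when the mask is 0. */
static unsigned int stride_for(unsigned int min_align_mask)
{
	unsigned int stride;

	stride = (iotlb_align_mask(min_align_mask) >> IO_TLB_SHIFT) + 1;
	assert(stride >= 1);
	return stride;
}
```

With these assumptions a mask of 0 or 0x3ff gives a stride of 1, 0xfff
gives 2, and 0xffff gives 32.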
> In fact with that, I think you could just mask orig_addr with ~IO_TLB_SIZE in
> the call to check_alignment() below, or shift everything down by
> IO_TLB_SHIFT in check_alignment() itself, instead of mangling
> min_align_mask at all (I'm assuming we do need to ignore the low-order bits
> of orig_addr at this point).
Yes, we do need to ignore the low bits as they won't ever be set in
tlb_dma_addr. Not sure the shift helps as we need to mask first.
I ended up killing check_alignment entirely, in favor of a new
slot_addr helper that calculates the address from the base and index
and which can be used in a few other places besides this one.
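
A sketch of what such a helper could look like, and how an alignment
test might use it. The signature, the slot_matches name and the
IO_TLB_SHIFT value of 11 are assumptions for illustration, not the
actual patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration: 2 KiB swiotlb slots. */
#define IO_TLB_SHIFT	11

/* Address of slot idx, given the base address of the slot array. */
static uint64_t slot_addr(uint64_t start, unsigned int idx)
{
	return start + ((uint64_t)idx << IO_TLB_SHIFT);
}

/*
 * Hypothetical alignment test: compare only the bits selected by the
 * slot-granular mask, since the low bits are never set in a slot address.
 */
static int slot_matches(uint64_t start, unsigned int idx,
			uint64_t orig_addr, uint64_t iotlb_align_mask)
{
	/* The mask must not cover bits below the slot size. */
	assert((iotlb_align_mask & ((1u << IO_TLB_SHIFT) - 1)) == 0);

	return (slot_addr(start, idx) & iotlb_align_mask) ==
	       (orig_addr & iotlb_align_mask);
}
```

Masking both sides with the same slot-granular mask is what lets the
low-order bits of orig_addr be ignored in the comparison.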