DWord alignment on ARMv7
Will Deacon
will.deacon at arm.com
Thu Mar 3 15:54:27 PST 2016
Hi Marc,
On Thu, Mar 03, 2016 at 11:27:11PM +0100, Marc Kleine-Budde wrote:
> I'm using btrfs on an ARMv7 and it turns out that the kernel has to
> fix up a lot of kernel-originated alignment issues.
>
> See /proc/cpu/alignment (~4h of uptime):
> > System: 22304815 (btrfs_get_token_64+0x13c/0x148 [btrfs])
>
> For example, when compiling the kernel on a btrfs volume the counter
> increases by 100...1000 per second.
>
> The function shown "btrfs_get_token_64()" is defined here:
> > http://lxr.free-electrons.com/source/fs/btrfs/struct-funcs.c#L53
> ...it already uses get_unaligned_leXX accessors.
>
> Quoting a comment in arch/arm/mm/alignment.c:
>
> * ARMv6 and later CPUs can perform unaligned accesses for
> * most single load and store instructions up to word size.
> * LDM, STM, LDRD and STRD still need to be handled.
>
> But on a 32-bit ARMv7, 64-bit accesses are not word-sized.
>
> Is the exception and fixup overhead negligible? Do we have to introduce
> something like HAVE_EFFICIENT_UNALIGNED_64BIT_ACCESS?
Ouch, that trap/emulate is certainly going to have an effect on your
performance. I doubt that HAVE_EFFICIENT_UNALIGNED_ACCESS applies to
types bigger than the native word size on many architectures, so my
hunch is that the btrfs code should be checking BITS_PER_LONG or similar
to establish whether or not to break the access up into word accesses.
A cursory look at the network layer indicates that kind of trick is done
over there.
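
To illustrate what I mean by breaking the access up, here is a rough
userspace sketch, not the actual btrfs or network code. The helper names
load_le32() and load_le64_split() are made up for this example. It reads
the two 32-bit halves of a little-endian 64-bit value separately and
combines them, so no single access is wider than a word. The halves are
assembled byte by byte only to keep the sketch portable; a kernel version
would more likely build on get_unaligned_le32() and select this path
based on BITS_PER_LONG.

```c
#include <stdint.h>

/*
 * Hypothetical helper: assemble a 32-bit little-endian value from
 * individual bytes, so the pointer may have any alignment.
 */
static uint32_t load_le32(const unsigned char *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: read a 64-bit little-endian value as two
 * independent 32-bit halves instead of one doubleword access.
 */
static uint64_t load_le64_split(const void *ptr)
{
	const unsigned char *p = ptr;
	uint32_t lo = load_le32(p);	/* bytes 0..3: low word  */
	uint32_t hi = load_le32(p + 4);	/* bytes 4..7: high word */

	return ((uint64_t)hi << 32) | lo;
}
```

Calling load_le64_split(buf + 1) on a byte buffer reads from an odd
address without ever issuing a 64-bit load.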
Will