[PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
Michael Hope
michael.hope at linaro.org
Mon Nov 5 17:02:27 EST 2012
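For context, the change under discussion is a read-modify-write of the
system control register early in the decompressor. A minimal sketch of
the v7 case (illustrative only, not the actual patch text; the scratch
register choice is assumed):

        mrc     p15, 0, r0, c1, c0, 0   @ read SCTLR
        bic     r0, r0, #(1 << 1)       @ clear the A bit (strict alignment checking)
        mcr     p15, 0, r0, c1, c0, 0   @ write SCTLR back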
On 6 November 2012 02:48, Rob Herring <robherring2 at gmail.com> wrote:
>
> On 11/05/2012 05:13 AM, Russell King - ARM Linux wrote:
> > On Mon, Nov 05, 2012 at 10:48:50AM +0000, Dave Martin wrote:
> >> On Thu, Oct 25, 2012 at 05:08:16PM +0200, Johannes Stezenbach wrote:
> >>> On Thu, Oct 25, 2012 at 09:25:06AM -0500, Rob Herring wrote:
> >>>> On 10/25/2012 09:16 AM, Johannes Stezenbach wrote:
> >>>>> On Thu, Oct 25, 2012 at 07:41:45AM -0500, Rob Herring wrote:
> >>>>>> On 10/25/2012 04:34 AM, Johannes Stezenbach wrote:
> >>>>>>> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> >>>>>>>
> >>>>>>>> While v6 can support unaligned accesses, it is optional and current
> >>>>>>>> compilers won't emit unaligned accesses. So we don't clear the A bit for
> >>>>>>>> v6.
> >>>>>>>
> >>>>>>> not true according to the gcc changes page
> >>>>>>
> >>>>>> What are you going to believe: documentation or what the compiler
> >>>>>> emitted? At least for ubuntu/linaro 4.6.3 which has the unaligned access
> >>>>>> support backported and 4.7.2, unaligned accesses are emitted for v7
> >>>>>> only. I guess default here means it is the default unless you change the
> >>>>>> default in your build of gcc.
> >>>>>
> >>>>> Since ARMv6 can handle unaligned access in the same way as ARMv7
> >>>>> it seems a clear bug in gcc which might hopefully get fixed.
> >>>>> Thus in this case I think it is reasonable to follow the
> >>>>> gcc documentation, otherwise the code would break for ARMv6
> >>>>> when gcc gets fixed.
> >>>>
> >>>> But the compiler can't assume the state of the U bit. I think it is
> >>>> still legal on v6 to not support unaligned accesses, but on v7 it is
> >>>> required. All the standard v6 ARM cores support it, but I'm not sure
> >>>> about custom cores or if there are SOCs with buses that don't support
> >>>> unaligned accesses properly.
> >>>
> >>> Well, I read the "...since Linux version 2.6.28" comment
> >>> in the gcc changes page in the way that they assume the
> >>> U-bit is set. (Although I'm not sure it really is???)
> >>
> >> Actually, the kernel checks the arch version and the U bit on boot,
> >> and chooses the appropriate setting for the A bit depending on the
> >> result. (See arch/arm/mm/alignment.c:alignment_init().)
> >
> > That is in the kernel itself, _after_ the decompressor has run. It is
> > not relevant to any discussion about the decompressor.
> >
> >> Currently, we depend on the CPU reset behaviour or firmware/
> >> bootloader to set the U bit for v6, but the behaviour should be
> >> correct either way, though unaligned accesses will obviously
> >> perform (much) better with U=1.
> >
> > Will someone _PLEASE_ address my initial comments against this patch
> > in light of the fact that it's now been proven _NOT_ to be just a V7
> > issue, rather than everyone seemingly burying their heads in the sand
> > over this.
>
> I tried adding -munaligned-accesses on a v6 build and still get byte
> accesses rather than unaligned word accesses. So this does seem to be a
> v7-only issue based on what gcc will currently produce. Copying Michael
> Hope who can hopefully provide some insight on why v6 unaligned accesses
> are not enabled.
This looks like a bug. Unaligned access is enabled for armv6, but it
seems to take effect only for cores with Thumb-2. Here's a test case
with both an unaligned field access and an unaligned block copy:
struct foo
{
        char a;
        int b;
        struct
        {
                int x[3];
        } c;
} __attribute__((packed));

int get_field(struct foo *p)
{
        return p->b;
}

void copy_block(struct foo *p, struct foo *q)
{
        p->c = q->c;
}
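Something along these lines reproduces the two listings (assuming a
cross toolchain installed as arm-linux-gnueabi-gcc and the test case
saved as test.c; -dp asks gcc to annotate the output with the insn
pattern names seen below):

        arm-linux-gnueabi-gcc -O2 -S -dp -march=armv7-a test.c
        arm-linux-gnueabi-gcc -O2 -S -dp -march=armv6 test.c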
With -march=armv7-a you get the correct code:
get_field:
        ldr     r0, [r0, #1]    @ unaligned @ 11 unaligned_loadsi/2 [length = 4]
        bx      lr              @ 21 *arm_return [length = 12]

copy_block:
        str     r4, [sp, #-4]!  @ 25 *push_multi [length = 4]
        mov     r2, r0          @ 2 *arm_movsi_vfp/1 [length = 4]
        ldr     r4, [r1, #5]!   @ unaligned @ 9 unaligned_loadsi/2 [length = 4]
        ldr     ip, [r1, #4]    @ unaligned @ 10 unaligned_loadsi/2 [length = 4]
        ldr     r1, [r1, #8]    @ unaligned @ 11 unaligned_loadsi/2 [length = 4]
        str     r4, [r2, #5]    @ unaligned @ 12 unaligned_storesi/2 [length = 4]
        str     ip, [r2, #9]    @ unaligned @ 13 unaligned_storesi/2 [length = 4]
        str     r1, [r2, #13]   @ unaligned @ 14 unaligned_storesi/2 [length = 4]
        ldmfd   sp!, {r4}
        bx      lr
With -march=armv6 you get a byte-by-byte field access and a correct
unaligned block copy:
get_field:
        ldrb    r1, [r0, #2]    @ zero_extendqisi2
        ldrb    r3, [r0, #1]    @ zero_extendqisi2
        ldrb    r2, [r0, #3]    @ zero_extendqisi2
        ldrb    r0, [r0, #4]    @ zero_extendqisi2
        orr     r3, r3, r1, asl #8
        orr     r3, r3, r2, asl #16
        orr     r0, r3, r0, asl #24
        bx      lr

copy_block:
        str     r4, [sp, #-4]!
        mov     r2, r0
        ldr     r4, [r1, #5]!   @ unaligned
        ldr     ip, [r1, #4]    @ unaligned
        ldr     r1, [r1, #8]    @ unaligned
        str     r4, [r2, #5]    @ unaligned
        str     ip, [r2, #9]    @ unaligned
        str     r1, [r2, #13]   @ unaligned
        ldmfd   sp!, {r4}
        bx      lr
readelf -A shows that the compiler planned to use unaligned access in
both cases. My suspicion is that GCC is using the extv pattern to extract
the field from memory, and that pattern is only enabled for
Thumb-2-capable cores.
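For reference, the relevant build attribute appears in both objects
along these lines (an illustrative readelf -A excerpt; exact fields
vary with the binutils version):

Attribute Section: aeabi
File Attributes
  Tag_CPU_arch: v6
  Tag_CPU_unaligned_access: v6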
I've logged PR55218. We'll discuss it at our next meeting.
-- Michael