[PATCH] ARM: tlb: __flush_tlb_mm need to use int asid var for BE correct operation

Victor Kamensky victor.kamensky at linaro.org
Mon Oct 7 02:47:37 EDT 2013


Hi Will,

Could you please review patch following this cover letter. I've run
into this problem when Linaro big endian topic branch received 3.12-rc2 
update and as result got your f0915781bd5edf78b1154e61efe962dc15872d09
"ARM: tlb: don't perform inner-shareable invalidation for local TLB ops"
commit. My Big Endian Arndale image was getting very weird user-land
crashes while fork glib call was performed.

It turns out that in __flush_tlb_mm function code passes 'ASID(mm)'
directly to tlb_op macro, unlike in other functions (i.e local_flush_tlb_mm)
where asid initially assigned to local variable with 'int' type and then
such variable is passed to tlb_op macro. Direct use of 'ASID(mm)'
in tlb_op macro does not work in big endian case. Resulting generated
code look like this (from flush_tlb_mm that uses __flush_tlb_mm):

(gdb) disassemble flush_tlb_mm
Dump of assembler code for function flush_tlb_mm:
<snip>
   0x80011984 <+32>:	dsb	ishst
   0x80011988 <+36>:	movs	r4, #0     <---------------------
   0x8001198a <+38>:	ldrb.w	r5, [r2, #375]	; 0x177
   0x8001198e <+42>:	ldr	r3, [r3, #8]
   0x80011990 <+44>:	tst.w	r3, #65536	; 0x10000
   0x80011994 <+48>:	it	ne
   0x80011996 <+50>:	mcrne	15, 0, r5, cr8, cr7, {2}
   0x8001199a <+54>:	tst.w	r1, #4194304	; 0x400000
   0x8001199e <+58>:	it	ne
   0x800119a0 <+60>:	mcrne	15, 0, r4, cr8, cr3, {2} <-------
   0x800119a4 <+64>:	dsb	ish

'15, 0, r4, cr8, cr3, {2}' receives 0 through r4 register.
Note that in LE case correct code is generated. 
If I change code to use intermediate int variable asid as
other places do, correct code is generated for __flush_tlb_mm.

My explanation is that ASID macro actually produces
64 bit integer. When passed to inline assembler as 'r'
operand is not really defined behaviour and it is 
wrong in BE case. When intermidiate 'int' variable is used 
ASID macro value is properly converted to 32 bit integer.
And inline assembler received int type, which works
correctly.

Note other possible ways to fix it is to add 'int' cast
either outside of ASID macro call or inside of it.
Personally I prefer variant that follows this cover
letter, because it is similar to other cases where 
such code pattern is used.

I've tried to understand whether proposed fix is really 
workaround for compiler bug. I failed to find clear 
explanation how gcc need to handle such situation.
Here is quote from gcc manual:

> The compiler cannot
> check whether the operands have data types that are reasonable for the
> instruction being executed.  It does not parse the assembler instruction
> template and does not know what it means or even whether it is valid
> assembler input.  The extended `asm' feature is most often used for
> machine instructions the compiler itself does not know exist.  If the
> output expression cannot be directly addressed (for example, it is a
> bit-field), your constraint must allow a register.  In that case, GCC
> will use the register as the output of the `asm', and then store that
> register into the output.

It is not clear whether 'unsigned long long' is 
"reasonable" type for arm "r" inline asm operand. So
it seems it is gray area. Personally I suspect this
issue came up before because in many places of tlbflush.h
'zero' local variable is used and 'asid' local
variable is used.

Here is small test case that illustrate code
generation issue and difference between LE and BE cases:

[kamensky at kamensky-w530 asid_inline]$ cat asid_inline.c
typedef unsigned long long u64;

struct mm_struct {
    unsigned long dummy1;
    u64 __attribute__((aligned(8))) counter;
};

void test1(struct mm_struct *mm)
{
    const int asid = ((mm)->counter & ~((~0ULL) << 8));
    do {
        asm("mcr " "p15, 0, %0, " "c8, c3, 2" : : "r" (asid) : "cc");
    } while (0);
}

void test2(struct mm_struct *mm)
{
    do {
        asm("mcr " "p15, 0, %0, " "c8, c3, 2" : : "r" (((mm)->counter & ~((~0ULL) << 8))) : "cc");
       
    } while (0);
}

void test3(struct mm_struct *mm)
{
    do {
        asm("mcr " "p15, 0, %0, " "c8, c3, 2" : : "r" ((int)((mm)->counter & ~((~0ULL) << 8))) : "cc");
       
    } while (0);
}

[kamensky at kamensky-w530 asid_inline]$ ./asid_inline.sh
+ arm-linux-gnueabihf-gcc -nostdinc -mbig-endian -O2 -mabi=aapcs-linux -mno-thumb-interwork -mthumb -march=armv7-a -msoft-float -c -o asid_inline.be.o asid_inline.c
+ arm-linux-gnueabihf-objdump --disassemble --reloc asid_inline.be.o

asid_inline.be.o:     file format elf32-bigarm


Disassembly of section .text:

00000000 <test1>:
   0:	7bc3      	ldrb	r3, [r0, #15]
   2:	ee08 3f53 	mcr	15, 0, r3, cr8, cr3, {2}
   6:	4770      	bx	lr

00000008 <test2>:
   8:	7bc3      	ldrb	r3, [r0, #15]
   a:	2200      	movs	r2, #0
   c:	ee08 2f53 	mcr	15, 0, r2, cr8, cr3, {2}
  10:	4770      	bx	lr
  12:	bf00      	nop

00000014 <test3>:
  14:	7bc3      	ldrb	r3, [r0, #15]
  16:	ee08 3f53 	mcr	15, 0, r3, cr8, cr3, {2}
  1a:	4770      	bx	lr
+ arm-linux-gnueabihf-gcc -nostdinc -mlittle-endian -O2 -mabi=aapcs-linux -mno-thumb-interwork -mthumb -march=armv7-a -msoft-float -c -o asid_inline.le.o asid_inline.c
+ arm-linux-gnueabihf-objdump --disassemble --reloc asid_inline.le.o

asid_inline.le.o:     file format elf32-littlearm


Disassembly of section .text:

00000000 <test1>:
   0:	7a03      	ldrb	r3, [r0, #8]
   2:	ee08 3f53 	mcr	15, 0, r3, cr8, cr3, {2}
   6:	4770      	bx	lr

00000008 <test2>:
   8:	7a02      	ldrb	r2, [r0, #8]
   a:	2300      	movs	r3, #0
   c:	ee08 2f53 	mcr	15, 0, r2, cr8, cr3, {2}
  10:	4770      	bx	lr
  12:	bf00      	nop

00000014 <test3>:
  14:	7a03      	ldrb	r3, [r0, #8]
  16:	ee08 3f53 	mcr	15, 0, r3, cr8, cr3, {2}
  1a:	4770      	bx	lr
[kamensky at kamensky-w530 asid_inline]$ 

Thanks,
Victor

Victor Kamensky (1):
  ARM: tlb: __flush_tlb_mm need to use int asid var for BE correct
    operation

 arch/arm/include/asm/tlbflush.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

-- 
1.8.1.4




More information about the linux-arm-kernel mailing list