[PATCH 01/15] ARM: assembler: introduce adr_l, ldr_l and str_l macros
Ard Biesheuvel
ard.biesheuvel at linaro.org
Tue Aug 8 08:19:24 PDT 2017
On 8 August 2017 at 16:10, Nicolas Pitre <nicolas.pitre at linaro.org> wrote:
> On Sat, 5 Aug 2017, Ard Biesheuvel wrote:
>
>> Like arm64, ARM supports position independent code sequences that
>> produce symbol references with a greater reach than the ordinary
>> adr/ldr instructions.
>>
>> Currently, we use open coded instruction sequences involving literals
>> and arithmetic operations. Instead, we can use movw/movt pairs on v7
>> CPUs, circumventing the D-cache entirely. For older CPUs, we can emit
>> the literal into a subsection, allowing it to be emitted out of line
>> while retaining the ability to perform arithmetic on label offsets.
>>
>> E.g., on pre-v7 CPUs, we can emit a PC-relative reference as follows:
>>
>> ldr <reg>, 222f
>> 111: add <reg>, <reg>, pc
>> .subsection 1
>> 222: .long <sym> - (111b + 8)
>> .previous
>>
>> This is allowed by the assembler because, unlike ordinary sections,
>> subsections are combined into a single section in the object file,
>> and so the label references are not true cross-section references that
>> are visible as relocations. Note that we could even do something like
>>
>> add <reg>, pc, #(222f - 111f) & ~0xfff
>> ldr <reg>, [<reg>, #(222f - 111f) & 0xfff]
>> 111: add <reg>, <reg>, pc
>> .subsection 1
>> 222: .long <sym> - (111b + 8)
>> .previous
>>
>> if it turns out that the 4 KB range of the ldr instruction is insufficient
>> to reach the literal in the subsection, although this is currently not a
>> problem (of the 98 objects built from .S files in a multi_v7_defconfig
>> build, only 11 have .text sections that are over 1 KB, and the largest one
>> [entry-armv.o] is 3308 bytes).
>>
>> Subsections have been available in binutils since 2004 at least, so
>> they should not cause any issues with older toolchains.
>>
>> So use the above to implement the macros mov_l, adr_l, adrm_l (using ldm
>> to load multiple literals at once), ldr_l and str_l, all of which will
>> use movw/movt pairs on v7 and later CPUs, and use PC-relative literals
>> otherwise.
>
> There is no adrm_l definition in this patch.
>
Ah yes, I played around with it, but it became a bit clunky, so I removed it:

adrl <reg1>, 222f
ldm <reg1>, {<reg1>, <reg2>}
111: add <reg1>, <reg1>, pc
add <reg2>, <reg2>, pc
.subsection 1
222: .long <sym1> - (111b + 8)
.long <sym2> - (111b + 12)
.previous

The adrl pseudo op always assembles to two instructions, so you need 5
instructions while using adr_l twice uses only 4. I am not sure if
eliminating one of the loads would make a huge difference, given that
there are no use cases for adrm_l on hot paths, at least not in this
series.
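
As a rough illustration of why the two literals above are biased by 8 and
12, here is a small Python model of the address arithmetic. It is only a
sketch, not part of the patch: the helper names and all addresses are made
up, and it assumes ARM (A32) state, where reading pc yields the address of
the current instruction plus 8.

```python
# Model of the PC-relative literal arithmetic (illustrative, not kernel code).
PC_READ_AHEAD = 8      # A32: reading pc gives the instruction address + 8
INSN_SIZE = 4          # every A32 instruction is 4 bytes
MASK32 = 0xFFFFFFFF    # registers and .long values wrap at 32 bits


def literal(sym, label, bias):
    """Value the assembler stores for `.long <sym> - (<label> + bias)`."""
    return (sym - (label + bias)) & MASK32


def add_pc(value, insn_addr):
    """Result of `add <reg>, <reg>, pc` executed at insn_addr."""
    return (value + insn_addr + PC_READ_AHEAD) & MASK32


# Hypothetical addresses, chosen only for illustration.
label_111 = 0x80001000   # address of the instruction at label 111
sym1 = 0x80123456        # symbol above the code
sym2 = 0x7FF00010        # symbol below the code (negative offset)

lit1 = literal(sym1, label_111, 8)
lit2 = literal(sym2, label_111, 12)

reg1 = add_pc(lit1, label_111)              # first add sits at 111:
reg2 = add_pc(lit2, label_111 + INSN_SIZE)  # second add is one insn later

assert reg1 == sym1
assert reg2 == sym2
```

The second literal needs the larger bias because its add executes one
instruction (4 bytes) after label 111, so pc reads 4 bytes higher there.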

> Also, might it be better to change mov_l to movl? This looks similar to
> the ARM64 movl pseudo-instruction, and unlike all the other _l variants,
> this is not producing a pc relative result.
>
On arm64, we have mov_q for a 64-bit absolute load, and I thought
mov_l was less confusing than mov_w. In general, I like the underscore
in the middle because on the one hand, it looks like an ordinary
mnemonic but on the other hand, it is obvious that it is not a true
instruction. mov_abs perhaps?

> Talking about the _l suffix: I wonder if this could be more meaningful,
> like _rel maybe? At least in the adr_l case, this could easily be
> confused with adrl.
>
On arm64, we have ldr_l, str_l and adr_l as well, and I usually try to
align between ARM and arm64 if I can.

> Otherwise I like it pretty much.
>
Thanks!