[PATCH 01/15] ARM: assembler: introduce adr_l, ldr_l and str_l macros
Nicolas Pitre
nicolas.pitre at linaro.org
Tue Aug 8 08:39:19 PDT 2017
On Tue, 8 Aug 2017, Ard Biesheuvel wrote:
> On 8 August 2017 at 16:10, Nicolas Pitre <nicolas.pitre at linaro.org> wrote:
> > On Sat, 5 Aug 2017, Ard Biesheuvel wrote:
> >
> >> Like arm64, ARM supports position independent code sequences that
> >> produce symbol references with a greater reach than the ordinary
> >> adr/ldr instructions.
> >>
> >> Currently, we use open coded instruction sequences involving literals
> >> and arithmetic operations. Instead, we can use movw/movt pairs on v7
> >> CPUs, circumventing the D-cache entirely. For older CPUs, we can emit
> >> the literal into a subsection, allowing it to be emitted out of line
> >> while retaining the ability to perform arithmetic on label offsets.
> >>
> >> E.g., on pre-v7 CPUs, we can emit a PC-relative reference as follows:
> >>
> >>       ldr     <reg>, 222f
> >> 111:  add     <reg>, <reg>, pc
> >>       .subsection 1
> >> 222:  .long   <sym> - (111b + 8)
> >>       .previous
> >>
> >> This is allowed by the assembler because, unlike ordinary sections,
> >> subsections are combined into a single section in the object file,
> >> and so the label references are not true cross-section references that
> >> are visible as relocations. Note that we could even do something like
> >>
> >>       add     <reg>, pc, #(222f - 111f) & ~0xfff
> >>       ldr     <reg>, [<reg>, #(222f - 111f) & 0xfff]
> >> 111:  add     <reg>, <reg>, pc
> >>       .subsection 1
> >> 222:  .long   <sym> - (111b + 8)
> >>       .previous
> >>
> >> if it turns out that the 4 KB range of the ldr instruction is insufficient
> >> to reach the literal in the subsection, although this is currently not a
> >> problem (of the 98 objects built from .S files in a multi_v7_defconfig
> >> build, only 11 have .text sections that are over 1 KB, and the largest one
> >> [entry-armv.o] is 3308 bytes)
> >>
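The arithmetic behind both sequences quoted above can be checked with a small model. This is only a sketch in Python, not part of the patch: the function names and the addresses are made up for illustration, and it assumes ARM state, where an instruction reading pc sees its own address plus 8 (Thumb code would need a different constant). It also does not check that the high part of the offset is encodable as an ARM immediate.

```python
PC_READ_OFFSET = 8  # ARM state: reading pc yields the instruction's address + 8

def pc_relative_ref(label_111, sym, load_bias=0):
    """Model 'ldr <reg>, 222f' followed by '111: add <reg>, <reg>, pc'."""
    # Link-time constant stored at 222: .long <sym> - (111b + 8)
    literal = sym - (label_111 + PC_READ_OFFSET)
    # At run time the code may sit at a different address (load_bias).
    pc_at_111 = label_111 + load_bias + PC_READ_OFFSET
    return (literal + pc_at_111) & 0xFFFFFFFF

def literal_address(label_111, label_222):
    """Model the add/ldr pair that reaches a literal more than 4 KB away."""
    offset = label_222 - label_111
    hi = offset & ~0xFFF   # add <reg>, pc, #(222f - 111f) & ~0xfff
    lo = offset & 0xFFF    # ldr <reg>, [<reg>, #(222f - 111f) & 0xfff]
    # The first add sits 8 bytes before 111, so the pc value it reads
    # is exactly the address of label 111.
    pc_at_first_add = label_111
    return pc_at_first_add + hi + lo
```

With a hypothetical label at 0x8000 and a symbol at 0x12340, pc_relative_ref(0x8000, 0x12340, load_bias=0x40000000) gives 0x40012340, i.e. the result follows the code wherever it is loaded, which is the point of the sequence.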
> >> Subsections have been available in binutils since 2004 at least, so
> >> they should not cause any issues with older toolchains.
> >>
> >> So use the above to implement the macros mov_l, adr_l, adrm_l (using ldm
> >> to load multiple literals at once), ldr_l and str_l, all of which will
> >> use movw/movt pairs on v7 and later CPUs, and use PC-relative literals
> >> otherwise.
> >
> > There is no adrm_l definition in this patch.
> >
>
> Ah yes, I played around with it but it becomes a bit clunky so I removed it:
>
>       adrl    <reg1>, 222f
>       ldm     <reg1>, {<reg1>, <reg2>}
> 111:  add     <reg1>, <reg1>, pc
>       add     <reg2>, <reg2>, pc
>       .subsection 1
> 222:  .long   <sym1> - (111b + 8)
>       .long   <sym2> - (111b + 12)
>       .previous
>
> The adrl pseudo op always assembles to two instructions, so you need 5
> instructions while using adr_l twice uses only 4. I am not sure if
> eliminating one of the loads would make a huge difference, given that
> there are no use cases for adrm_l on hot paths, at least not in this
> series.
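For reference, the "+ 8" and "+ 12" in the two literals above follow from each add reading pc at its own address. A rough Python model of that, assuming ARM state (4-byte instructions, pc reads as the instruction address plus 8); the helper names and addresses are hypothetical:

```python
PC_READ_OFFSET = 8   # ARM state: reading pc yields the instruction address + 8
INSN_SIZE = 4        # every ARM-state instruction is 4 bytes

def adrm_literals(label_111, syms):
    """Literals for consecutive 'add <regN>, <regN>, pc' starting at 111."""
    # The i-th add sits i * 4 bytes after 111, hence + 8, + 12, ...
    return [sym - (label_111 + PC_READ_OFFSET + i * INSN_SIZE)
            for i, sym in enumerate(syms)]

def run_adds(label_111, literals):
    """Apply each add with the pc value it would read at its own address."""
    return [lit + (label_111 + i * INSN_SIZE) + PC_READ_OFFSET
            for i, lit in enumerate(literals)]
```

Feeding the literals back through run_adds returns the original symbol addresses, which is all the sequence relies on.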

I'd suggest you keep it to a minimum. Using adr_l twice is clear and
obvious.

> > Also, might it be better to change mov_l to movl? This looks similar to
> > the ARM64 movl pseudo-instruction, and unlike all the other _l variants,
> > this is not producing a pc relative result.
> >
>
> On arm64, we have mov_q for a 64-bit absolute load, and I thought
> mov_l was less confusing than mov_w. In general, I like the underscore
> in the middle because on the one hand, it looks like an ordinary
> mnemonic but on the other hand, it is obvious that it is not a true
> instruction. mov_abs perhaps?
>
> > Talking about the _l suffix: I wonder if this could be more meaningful,
> > like _rel maybe? At least in the adr_l case, this could easily be
> > confused with adrl.
> >
>
> On arm64, we have ldr_l, str_l and adr_l as well, and I usually try to
> align between ARM and arm64 if I can.

OK. I'm much less versed in ARM64 assembly so I'll defer to your
judgment. It's good if this mnemonic scheme already exists there with
a similar meaning.

Nicolas