[PATCH v3 02/13] ARM: ftrace: use ADD not POP to counter PUSH at entry

Ard Biesheuvel ardb at kernel.org
Thu Feb 3 00:21:53 PST 2022


The compiler emitted hook used for ftrace consists of a PUSH {LR} to
preserve the link register, followed by a branch-and-link (BL) to
__gnu_mount_nc. Dynamic ftrace patches away the latter to turn the
combined sequence into a NOP, using a POP {LR} instruction.

This is not necessary, since the link register does not get clobbered in
this case, and simply adding #4 to the stack pointer is sufficient, and
avoids a memory access that may take a few cycles to resolve depending
on the micro-architecture.

Signed-off-by: Ard Biesheuvel <ardb at kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers at google.com>
Reviewed-by: Linus Walleij <linus.walleij at linaro.org>
Reviewed-by: Steven Rostedt (Google) <rostedt at goodmis.org>
---
 arch/arm/kernel/entry-ftrace.S |  2 +-
 arch/arm/kernel/ftrace.c       | 15 +++++++++++++--
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/arch/arm/kernel/entry-ftrace.S b/arch/arm/kernel/entry-ftrace.S
index f4886fb6e9ba..dca12a09322a 100644
--- a/arch/arm/kernel/entry-ftrace.S
+++ b/arch/arm/kernel/entry-ftrace.S
@@ -27,7 +27,7 @@
  * allows it to be clobbered in subroutines and doesn't use it to hold
  * parameters.)
  *
- * When using dynamic ftrace, we patch out the mcount call by a "pop {lr}"
+ * When using dynamic ftrace, we patch out the mcount call by a "add sp, #4"
  * instead of the __gnu_mcount_nc call (see arch/arm/kernel/ftrace.c).
  */
 
diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
index a006585e1c09..811cadf7eefc 100644
--- a/arch/arm/kernel/ftrace.c
+++ b/arch/arm/kernel/ftrace.c
@@ -24,10 +24,21 @@
 #include <asm/set_memory.h>
 #include <asm/patch.h>
 
+/*
+ * The compiler emitted profiling hook consists of
+ *
+ *   PUSH    {LR}
+ *   BL	     __gnu_mcount_nc
+ *
+ * To turn this combined sequence into a NOP, we need to restore the value of
+ * SP before the PUSH. Let's use an ADD rather than a POP into LR, as LR is not
+ * modified anyway, and reloading LR from memory is highly likely to be less
+ * efficient.
+ */
 #ifdef CONFIG_THUMB2_KERNEL
-#define	NOP		0xf85deb04	/* pop.w {lr} */
+#define	NOP		0xf10d0d04	/* add.w sp, sp, #4 */
 #else
-#define	NOP		0xe8bd4000	/* pop {lr} */
+#define	NOP		0xe28dd004	/* add   sp, sp, #4 */
 #endif
 
 #ifdef CONFIG_DYNAMIC_FTRACE
-- 
2.30.2




More information about the linux-arm-kernel mailing list