[PATCH v3 02/13] ARM: ftrace: use ADD not POP to counter PUSH at entry
Ard Biesheuvel
ardb at kernel.org
Thu Feb 3 00:21:53 PST 2022
The compiler emitted hook used for ftrace consists of a PUSH {LR} to
preserve the link register, followed by a branch-and-link (BL) to
__gnu_mount_nc. Dynamic ftrace patches away the latter to turn the
combined sequence into a NOP, using a POP {LR} instruction.
This is not necessary, since the link register does not get clobbered in
this case, and simply adding #4 to the stack pointer is sufficient, and
avoids a memory access that may take a few cycles to resolve depending
on the micro-architecture.
Signed-off-by: Ard Biesheuvel <ardb at kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers at google.com>
Reviewed-by: Linus Walleij <linus.walleij at linaro.org>
Reviewed-by: Steven Rostedt (Google) <rostedt at goodmis.org>
---
arch/arm/kernel/entry-ftrace.S | 2 +-
arch/arm/kernel/ftrace.c | 15 +++++++++++++--
2 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/arch/arm/kernel/entry-ftrace.S b/arch/arm/kernel/entry-ftrace.S
index f4886fb6e9ba..dca12a09322a 100644
--- a/arch/arm/kernel/entry-ftrace.S
+++ b/arch/arm/kernel/entry-ftrace.S
@@ -27,7 +27,7 @@
* allows it to be clobbered in subroutines and doesn't use it to hold
* parameters.)
*
- * When using dynamic ftrace, we patch out the mcount call by a "pop {lr}"
+ * When using dynamic ftrace, we patch out the mcount call by a "add sp, #4"
* instead of the __gnu_mcount_nc call (see arch/arm/kernel/ftrace.c).
*/
diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
index a006585e1c09..811cadf7eefc 100644
--- a/arch/arm/kernel/ftrace.c
+++ b/arch/arm/kernel/ftrace.c
@@ -24,10 +24,21 @@
#include <asm/set_memory.h>
#include <asm/patch.h>
+/*
+ * The compiler emitted profiling hook consists of
+ *
+ * PUSH {LR}
+ * BL __gnu_mcount_nc
+ *
+ * To turn this combined sequence into a NOP, we need to restore the value of
+ * SP before the PUSH. Let's use an ADD rather than a POP into LR, as LR is not
+ * modified anyway, and reloading LR from memory is highly likely to be less
+ * efficient.
+ */
#ifdef CONFIG_THUMB2_KERNEL
-#define NOP 0xf85deb04 /* pop.w {lr} */
+#define NOP 0xf10d0d04 /* add.w sp, sp, #4 */
#else
-#define NOP 0xe8bd4000 /* pop {lr} */
+#define NOP 0xe28dd004 /* add sp, sp, #4 */
#endif
#ifdef CONFIG_DYNAMIC_FTRACE
--
2.30.2
More information about the linux-arm-kernel
mailing list