KASAN issues with idle / hotplug area (was: Re: [PATCH v5sub1 7/8] arm64: move kernel image to base of vmalloc area)

Mark Rutland mark.rutland at arm.com
Wed Feb 17 09:01:11 PST 2016

On Wed, Feb 17, 2016 at 02:39:51PM +0000, Mark Rutland wrote:
> When we go down for idle, in __cpu_suspend_enter we stash some context
> to the stack (in assembly). The CPU may return from a cold state via
> cpu_resume, where we restore context from the stack.
> However, after storing the context we call psci_suspend_finisher, which
> calls psci_cpu_suspend, which calls invoke_psci_fn_*. As
> psci_cpu_suspend and invoke_psci_fn_* are instrumented, they poison
> memory on function entrance, but we never perform the unpoisoning.
> That was always the case for psci_suspend_finisher, so there was a
> latent issue that we were somehow avoiding. Perhaps we got luck with
> stack layout and never hit the poison.
> I'm not sure how we fix that, as invoke_psci_fn_* may or may not return
> for arbitrary reasons (e.g. a CPU_SUSPEND_CALL may or may not return
> depending on whether an interrupt comes in at the right time).
> Perhaps the simplest option is to not instrument invoke_psci_fn_* and
> psci_suspend_finisher. Do we have a per-function annotation to avoid
> KASAN instrumentation, like notrace? I need to investigate, but we may
> also need notrace for similar reasons.

I found __no_sanitize_address.

As an aside, could we rename that to nokasan? That would match the style
of notrace, is just as clear, and would make it far easier to write
consistent legible function prototypes...

Otherwise, I came up with the patch below, per the reasoning above.

It _changes_ the KASAN splats (I see errors in tick_program_event rather
than find_busiest_group), but doesn't seem to get rid of them. I'm not
sure if I've missed something, or if we also have another latent issue.



>From 8f7ae44d8f8862f5300483d45617b5bd05fc652f Mon Sep 17 00:00:00 2001
From: Mark Rutland <mark.rutland at arm.com>
Date: Wed, 17 Feb 2016 15:38:22 +0000
Subject: [PATCH] arm64/psci: avoid KASAN splats with idle

When a CPU goes into a deep idle state, we store CPU context in
__cpu_suspend_enter, then call psci_suspend_finisher to invoke the
firmware. If we entered a deep idle state, we do not return directly,
and instead start cold, restoring state in cpu_resume.

Thus we may execute the prologue and body of psci_suspend_finisher and
the PSCI invocation function, but not their epilogue. When using KASAN
this means that we poison a region of shadow memory, but never unpoison
it. After we resume, subsequent stack accesses may hit the stale poison
values, leading to false positives from KASAN.

To avoid this, we must ensure that functions called after the context
save are not instrumented, and do not posion the shadow region, by
annotating them with __no_sanitize_address. As common inlines they may
call are not similarly annotated, and the compiler refuses to allow
function attribute mismatches, we must also avoid calls to such

ARM is not affected, as it does not support KASAN. When CONFIG_KASAN is
not selected, __no_sanitize_address expands to nothing, so the
annotation should not be harmful.

Signed-off-by: Mark Rutland <mark.rutland at arm.com>
 arch/arm64/kernel/psci.c | 14 ++++++++------
 drivers/firmware/psci.c  |  3 +++
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kernel/psci.c b/arch/arm64/kernel/psci.c
index f67f35b..8324ce8 100644
--- a/arch/arm64/kernel/psci.c
+++ b/arch/arm64/kernel/psci.c
@@ -32,12 +32,16 @@
 static DEFINE_PER_CPU_READ_MOSTLY(u32 *, psci_power_state);
+static phys_addr_t cpu_resume_phys;
 static int __maybe_unused cpu_psci_cpu_init_idle(unsigned int cpu)
 	int i, ret, count = 0;
 	u32 *psci_states;
 	struct device_node *state_node, *cpu_node;
+	cpu_resume_phys = virt_to_phys(cpu_resume);
 	cpu_node = of_get_cpu_node(cpu, NULL);
 	if (!cpu_node)
 		return -ENODEV;
@@ -178,12 +182,10 @@ static int cpu_psci_cpu_kill(unsigned int cpu)
-static int psci_suspend_finisher(unsigned long index)
+static int psci_suspend_finisher(unsigned long state)
-	u32 *state = __this_cpu_read(psci_power_state);
-	return psci_ops.cpu_suspend(state[index - 1],
-				    virt_to_phys(cpu_resume));
+	return psci_ops.cpu_suspend(state, cpu_resume_phys);
 static int __maybe_unused cpu_psci_cpu_suspend(unsigned long index)
@@ -200,7 +202,7 @@ static int __maybe_unused cpu_psci_cpu_suspend(unsigned long index)
 	if (!psci_power_state_loses_context(state[index - 1]))
 		ret = psci_ops.cpu_suspend(state[index - 1], 0);
-		ret = cpu_suspend(index, psci_suspend_finisher);
+		ret = cpu_suspend(state[index - 1], psci_suspend_finisher);
 	return ret;
diff --git a/drivers/firmware/psci.c b/drivers/firmware/psci.c
index f25cd79..e4e8dc1 100644
--- a/drivers/firmware/psci.c
+++ b/drivers/firmware/psci.c
@@ -106,6 +106,7 @@ bool psci_power_state_is_valid(u32 state)
 	return !(state & ~valid_mask);
 static unsigned long __invoke_psci_fn_hvc(unsigned long function_id,
 			unsigned long arg0, unsigned long arg1,
 			unsigned long arg2)
@@ -116,6 +117,7 @@ static unsigned long __invoke_psci_fn_hvc(unsigned long function_id,
 	return res.a0;
 static unsigned long __invoke_psci_fn_smc(unsigned long function_id,
 			unsigned long arg0, unsigned long arg1,
 			unsigned long arg2)
@@ -148,6 +150,7 @@ static u32 psci_get_version(void)
 	return invoke_psci_fn(PSCI_0_2_FN_PSCI_VERSION, 0, 0, 0);
 static int psci_cpu_suspend(u32 state, unsigned long entry_point)
 	int err;

More information about the linux-arm-kernel mailing list