[PATCH] ARM: mm: fix stack corruption when CONFIG_ARM_PV_FIXUP=y

Zhi-zhou Zhang zhizhou.zh at gmail.com
Sat Sep 9 01:23:45 PDT 2023


On Fri, Sep 08, 2023 at 11:00:31PM +0200, Linus Walleij wrote:
> On Fri, Sep 8, 2023 at 3:50 PM Russell King (Oracle)
> <linux at armlinux.org.uk> wrote:
> 
> > However, it makes a total nonsense of the comment, which explains
> > precisely why the flush_cache_all() is where it is. Moving it before
> > that comment means that the comment is now rediculous.
> 
> Zhizhou, can you look over the comment placement?

Linus, I found the bug on a cortex-a55 cpu with high address memory.
Since the lr is also corruptted, when flush_cache_all() is done, the
program continues at the next instruction after fixup_pv_table(). So
the disabling cache and flush_cache_all() is executed a secondary time.
Then this time lr is correct so the kernel may boot up as usual.

I read the comment carefully, I am not sure how "to ensure nothing is
prefetched into the caches" affects the system. My patch doesn't
prevent instrution prefetch though. But in my board everythings looks
good.

So I come up with a new fixup plan, that's keep the location of
flush_cache_all() with adding a flush stack cache before disabling
cache, the code is as follow, the fix is a bit ugly -- it makes
assumption stack grow towards low address and flush_cache_all() will
not occupy more than 32 bytes in the future. Comparing with move
flush_cache_all() before disabling cache, Which one do you prefer?
Thanks!

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 03fb0fe926f3..83a54c61a86b 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -1640,6 +1640,7 @@ static void __init early_paging_init(const struct machine_desc *mdesc)
 	unsigned long pa_pgd;
 	unsigned int cr, ttbcr;
 	long long offset;
+	void *stack;
 
 	if (!mdesc->pv_fixup)
 		return;
@@ -1675,7 +1676,14 @@ static void __init early_paging_init(const struct machine_desc *mdesc)
 	/* Run the patch stub to update the constants */
 	fixup_pv_table(&__pv_table_begin,
 		(&__pv_table_end - &__pv_table_begin) << 2);
-	flush_cache_all();
+
+	/*
+	 * clean stack in cacheline that undering memory will be changed in
+	 * the following flush_cache_all(). assuming 32 bytes is enough for
+	 * flush_cache_all().
+	 */
+	stack = (void *) (current_stack_pointer - 32);
+	__cpuc_flush_dcache_area(stack, 32);
 
 	/*
 	 * We changing not only the virtual to physical mapping, but also
@@ -1691,6 +1699,7 @@ static void __init early_paging_init(const struct machine_desc *mdesc)
 	asm("mrc p15, 0, %0, c2, c0, 2" : "=r" (ttbcr));
 	asm volatile("mcr p15, 0, %0, c2, c0, 2"
 		: : "r" (ttbcr & ~(3 << 8 | 3 << 10)));
+	flush_cache_all();
 
 	/*
 	 * Fixup the page tables - this must be in the idmap region as

> 
> > So, please don't put it in the patch system.
> >
> > The patch certainly needs to be tested on TI Keystone which is the
> > primary user of this code.
> 
> Added Andrew Davis and Nishanth Menon to the thread:
> can you folks review and test this for Keystone?
> 
> Yours,
> Linus Walleij



More information about the linux-arm-kernel mailing list