[PATCH v4 7/7] ARM: implement support for vmap'ed stacks

Ard Biesheuvel ardb at kernel.org
Tue Dec 21 08:20:05 PST 2021


On Tue, 21 Dec 2021 at 14:51, Marek Szyprowski <m.szyprowski at samsung.com> wrote:
>
> Hi,
>
> On 21.12.2021 14:34, Ard Biesheuvel wrote:
> > On Tue, 21 Dec 2021 at 12:15, Marek Szyprowski <m.szyprowski at samsung.com> wrote:
> >> Hi Ard,
> >>
> >> On 21.12.2021 11:44, Ard Biesheuvel wrote:
> >>> On Tue, 21 Dec 2021 at 11:39, Marek Szyprowski <m.szyprowski at samsung.com> wrote:
> >>>> On 22.11.2021 10:28, Ard Biesheuvel wrote:
> >>>>> Wire up the generic support for managing task stack allocations via vmalloc,
> >>>>> and implement the entry code that detects whether we faulted because of a
> >>>>> stack overrun (or future stack overrun caused by pushing the pt_regs array).
> >>>>>
> >>>>> While this adds a fair amount of tricky entry asm code, it should be
> >>>>> noted that it only adds a TST + branch to the svc_entry path. The code
> >>>>> implementing the non-trivial handling of the overflow stack is emitted
> >>>>> out-of-line into the .text section.
> >>>>>
> >>>>> Since on ARM, we rely on do_translation_fault() to keep PMD level page
> >>>>> table entries that cover the vmalloc region up to date, we need to
> >>>>> ensure that we don't hit such a stale PMD entry when accessing the
> >>>>> stack. So we do a dummy read from the new stack while still running from
> >>>>> the old one on the context switch path, and bump the vmalloc_seq counter
> >>>>> when PMD level entries in the vmalloc range are modified, so that the MM
> >>>>> switch fetches the latest version of the entries.
> >>>>>
> >>>>> Note that we need to increase the per-mode stack by 1 word, to gain some
> >>>>> space to stash a GPR until we know it is safe to touch the stack.
> >>>>> However, due to the cacheline alignment of the struct, this does not
> >>>>> actually increase the memory footprint of the struct stack array at all.
> >>>>>
> >>>>> Signed-off-by: Ard Biesheuvel <ardb at kernel.org>
> >>>>> Tested-by: Keith Packard <keithpac at amazon.com>
> >>>> This patch landed recently in linux-next 20211220 as commit a1c510d0adc6
> >>>> ("ARM: implement support for vmap'ed stacks"). Sadly it breaks
> >>>> suspend/resume operation on all ARM 32bit Exynos SoCs. Probably the
> >>>> suspend/resume related code must be updated somehow (it partially works
> >>>> on physical addresses and disabled MMU), but I didn't analyze it yet. If
> >>>> you have any hints, let me know.
> >>>>
> >>> Are there any such systems in KernelCI? We caught a suspend/resume
> >>> related issue in development, which is why the hunk below was added.
> >>
> >> I think that some Exynos-based Odroids (U3 and XU3) were some time ago
> >> available in KernelCI, but I don't know if they are still there.
> >>
> >>
> >>> In general, any virt-to-phys translation involving an address on the
> >>> stack will become problematic.
> >>>
> >>> Could you please confirm whether the issue persists with the patch
> >>> applied but with CONFIG_VMAP_STACK turned off? Just so we know we are
> >>> looking in the right place?
> >>
> >> I've just checked. After disabling CONFIG_VMAP_STACK suspend/resume
> >> works fine both on commit a1c510d0adc6 and linux-next 20211220.
> >>
> > Thanks. Any other context you can provide beyond 'does not work' ?
>
> Well, the board suspends properly, but then it doesn't wake up (tested
> remotely with rtcwake command). So far I cannot provide anything more.
>

Thanks. Does the below help? Or otherwise, could you try doubling the
size of the overflow stack at arch/arm/include/asm/thread_info.h:34?


diff --git a/arch/arm/kernel/sleep.S b/arch/arm/kernel/sleep.S
index b062b3738bc6..a59bd03a3f2e 100644
--- a/arch/arm/kernel/sleep.S
+++ b/arch/arm/kernel/sleep.S
@@ -67,7 +67,7 @@ ENTRY(__cpu_suspend)
        ldr     r4, =cpu_suspend_size
 #endif
        mov     r5, sp                  @ current virtual SP
-#ifdef CONFIG_VMAP_STACK
+#if 0 //def CONFIG_VMAP_STACK
        @ Run the suspend code from the overflow stack so we don't have to rely
        @ on vmalloc-to-phys conversions anywhere in the arch suspend code.
        @ The original SP value captured in R5 will be restored on the way out.
diff --git a/arch/arm/kernel/suspend.c b/arch/arm/kernel/suspend.c
index 43f0a3ebf390..ab1218ac5b4a 100644
--- a/arch/arm/kernel/suspend.c
+++ b/arch/arm/kernel/suspend.c
@@ -76,7 +76,9 @@ void __cpu_suspend_save(u32 *ptr, u32 ptrsz, u32 sp, u32 *save_ptr)
 {
        u32 *ctx = ptr;

-       *save_ptr = virt_to_phys(ptr);
+       *save_ptr = IS_ENABLED(CONFIG_VMAP_STACK)
+                   ? __pfn_to_phys(vmalloc_to_pfn(ptr)) + offset_in_page(ptr)
+                   : virt_to_phys(ptr);

        /* This must correspond to the LDM in cpu_resume() assembly */
        *ptr++ = virt_to_phys(idmap_pgd);
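
For context on the suspend.c hunk: virt_to_phys() is plain linear-map
arithmetic, so it yields a bogus physical address for a pointer into a
vmap'ed stack, whereas vmalloc_to_pfn() looks up the page that actually
backs the address. Below is a minimal userspace sketch of that difference
in C. Everything in it is hypothetical and for illustration only: the
sketch_* names, the toy two-entry table standing in for the page tables,
and the PAGE_OFFSET/PHYS_OFFSET values are assumptions, not kernel code
or values from any real board.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative assumptions only, not taken from any real platform. */
#define SKETCH_PAGE_SHIFT   12u
#define SKETCH_PAGE_SIZE    (1u << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_OFFSET  0xc0000000u /* virtual start of the linear map */
#define SKETCH_PHYS_OFFSET  0x40000000u /* physical start of RAM */

/* One entry of a toy "page table" for the vmalloc region. */
struct sketch_mapping {
	uint32_t vaddr_page;	/* page-aligned virtual address */
	uint32_t pfn;		/* physical frame backing that page */
};

/* Two virtually contiguous pages backed by scattered physical frames. */
static const struct sketch_mapping sketch_vmalloc_map[] = {
	{ 0xf0000000u, 0x41234u },
	{ 0xf0001000u, 0x40077u },
};

/* Stand-in for virt_to_phys(): only meaningful for linear-map addresses. */
static uint32_t sketch_linear_virt_to_phys(uint32_t vaddr)
{
	return vaddr - SKETCH_PAGE_OFFSET + SKETCH_PHYS_OFFSET;
}

/* Stand-in for vmalloc_to_pfn(): look the page up instead of computing it. */
static uint32_t sketch_vmalloc_to_pfn(uint32_t vaddr)
{
	uint32_t page = vaddr & ~(SKETCH_PAGE_SIZE - 1u);
	size_t n = sizeof(sketch_vmalloc_map) / sizeof(sketch_vmalloc_map[0]);

	for (size_t i = 0; i < n; i++)
		if (sketch_vmalloc_map[i].vaddr_page == page)
			return sketch_vmalloc_map[i].pfn;

	assert(0 && "address is not mapped in the toy vmalloc table");
	return 0;
}

/* Same shape as the expression in the hunk: frame base plus in-page offset. */
static uint32_t sketch_vmalloc_virt_to_phys(uint32_t vaddr)
{
	return (sketch_vmalloc_to_pfn(vaddr) << SKETCH_PAGE_SHIFT) +
	       (vaddr & (SKETCH_PAGE_SIZE - 1u));
}
```

With these made-up numbers, sketch_vmalloc_virt_to_phys(0xf0000abc) gives
0x41234abc, while the linear formula gives 0x70000abc, which points at
unrelated memory.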


