[PATCH 1/6] arm64: vdso: Disable dwarf unwinding through the sigreturn trampoline

Dave Martin Dave.Martin at arm.com
Tue Jun 23 05:59:56 EDT 2020


On Tue, Jun 23, 2020 at 09:54:31AM +0100, Will Deacon wrote:
> Commit 7e9f5e6629f6 ("arm64: vdso: Add --eh-frame-hdr to ldflags") results
> in a .eh_frame_hdr section for the vDSO, which in turn causes the libgcc
> unwinder to unwind out of signal handlers using the .eh_frame information
> populated by our .cfi directives. In conjunction with a4eb355a3fda
> ("arm64: vdso: Fix CFI directives in sigreturn trampoline"), this has
> been shown to cause segmentation faults originating from within the
> unwinder during thread cancellation:
> 
>  | Thread 14 "virtio-net-rx" received signal SIGSEGV, Segmentation fault.
>  | 0x0000000000435e24 in uw_frame_state_for ()
>  | (gdb) bt
>  | #0  0x0000000000435e24 in uw_frame_state_for ()
>  | #1  0x0000000000436e88 in _Unwind_ForcedUnwind_Phase2 ()
>  | #2  0x00000000004374d8 in _Unwind_ForcedUnwind ()
>  | #3  0x0000000000428400 in __pthread_unwind (buf=<optimized out>) at unwind.c:121
>  | #4  0x0000000000429808 in __do_cancel () at ./pthreadP.h:304
>  | #5  sigcancel_handler (sig=32, si=0xffff33c743f0, ctx=<optimized out>) at nptl-init.c:200
>  | #6  sigcancel_handler (sig=<optimized out>, si=0xffff33c743f0, ctx=<optimized out>) at nptl-init.c:165
>  | #7  <signal handler called>
>  | #8  futex_wait_cancelable (private=0, expected=0, futex_word=0x3890b708) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
> 
> After considerable bashing of heads, it appears that our CFI directives
> for unwinding out of the sigreturn trampoline are only processed by libgcc
> when both a .eh_frame_hdr section is present *and* the mysterious NOP is
> covered by an entry in .eh_frame. With both of these now in place, it has
> highlighted that our CFI directives are not comprehensive enough to
> restore the stack pointer of the interrupted context. This results in libgcc
> falling back to an arm64-specific unwinder after computing a bogus PC value
> from the unwind tables. The unwinder promptly dereferences this bogus address
> in an attempt to see if the pointed-to instruction sequence looks like
> the sigreturn trampoline.
> 
> Restore the old unwind behaviour, which relied solely on heuristics in
> the unwinder, by removing the .eh_frame_hdr section from the vDSO and
> commenting out the insufficient CFI directives for now. Add comments to
> explain the current, miserable state of affairs.
> 
> Cc: Vincenzo Frascino <vincenzo.frascino at arm.com>
> Cc: Tamas Zsoldos <tamas.zsoldos at arm.com>
> Cc: Szabolcs Nagy <szabolcs.nagy at arm.com>
> Cc: Catalin Marinas <catalin.marinas at arm.com>
> Cc: Daniel Kiss <daniel.kiss at arm.com>
> Cc: Dave Martin <dave.martin at arm.com>
> Reported-by: Ard Biesheuvel <ardb at kernel.org>
> Signed-off-by: Will Deacon <will at kernel.org>
> ---
>  arch/arm64/kernel/vdso/Makefile    |  2 +-
>  arch/arm64/kernel/vdso/sigreturn.S | 54 +++++++++++++++++++-----------
>  2 files changed, 35 insertions(+), 21 deletions(-)
> 
> diff --git a/arch/arm64/kernel/vdso/Makefile b/arch/arm64/kernel/vdso/Makefile
> index 556d424c6f52..4a2e06181d80 100644
> --- a/arch/arm64/kernel/vdso/Makefile
> +++ b/arch/arm64/kernel/vdso/Makefile
> @@ -24,7 +24,7 @@ btildflags-$(CONFIG_ARM64_BTI_KERNEL) += -z force-bti
>  # routines, as x86 does (see 6f121e548f83 ("x86, vdso: Reimplement vdso.so
>  # preparation in build-time C")).
>  ldflags-y := -shared -nostdlib -soname=linux-vdso.so.1 --hash-style=sysv \
> -		-Bsymbolic --eh-frame-hdr --build-id -n $(btildflags-y) -T
> +		-Bsymbolic --build-id -n $(btildflags-y) -T
>  
>  ccflags-y := -fno-common -fno-builtin -fno-stack-protector -ffixed-x18
>  ccflags-y += -DDISABLE_BRANCH_PROFILING
> diff --git a/arch/arm64/kernel/vdso/sigreturn.S b/arch/arm64/kernel/vdso/sigreturn.S
> index 620a3ef837b7..0e18729abc3b 100644
> --- a/arch/arm64/kernel/vdso/sigreturn.S
> +++ b/arch/arm64/kernel/vdso/sigreturn.S
> @@ -18,29 +18,40 @@
>  
>  	.text
>  
> +/*
> + * NOTE!!!  You may notice that all of the .cfi directives in this file have
> + * been commented out. This is because they have been shown to trigger segfaults
> + * in libgcc when unwinding out of a SIGCANCEL handler to invoke pthread
> + * cleanup handlers during the thread cancellation dance. By omitting the
> + * directives, we trigger an arm64-specific fallback path in the unwinder which
> + * recognises the signal frame and restores many of the registers directly from
> + * the sigcontext. Re-enabling the cfi directives here therefore needs to be
> + * much more comprehensive to reduce the risk of further regressions.
> + */
> +
>  /* Ensure that the mysterious NOP can be associated with a function. */
> -	.cfi_startproc
> +//	.cfi_startproc
>  
>  /*
> - * .cfi_signal_frame causes the corresponding Frame Description Entry in the
> - * .eh_frame section to be annotated as a signal frame. This allows DWARF
> - * unwinders (e.g. libstdc++) to implement _Unwind_GetIPInfo(), which permits
> - * unwinding out of the signal trampoline without the need for the mysterious
> - * NOP.
> + * .cfi_signal_frame causes the corresponding Frame Description Entry (FDE) in
> + * the .eh_frame section to be annotated as a signal frame. This allows DWARF
> + * unwinders (e.g. libstdc++) to implement _Unwind_GetIPInfo() and identify
> + * the next frame using the unmodified return address instead of subtracting 1,
> + * which may yield the wrong FDE.
>   */
> -	.cfi_signal_frame
> +//	.cfi_signal_frame
>  
>  /*
>   * Tell the unwinder where to locate the frame record linking back to the
> - * interrupted context. We don't provide unwind info for registers other
> - * than the frame pointer and the link register here; in practice, this
> - * is sufficient for unwinding in C/C++ based runtimes and the values in
> - * the sigcontext may have been modified by this point anyway. Debuggers
> + * interrupted context. We don't provide unwind info for registers other than
> + * the frame pointer and the link register here; in practice, this is likely to
> + * be insufficient for unwinding in C/C++ based runtimes, especially without a
> + * means to restore the stack pointer. Thankfully, unwinders and debuggers
>   * already have baked-in strategies for attempting to unwind out of signals.
>   */
> -	.cfi_def_cfa    x29, 0
> -	.cfi_offset     x29, 0 * 8
> -	.cfi_offset     x30, 1 * 8
> +//	.cfi_def_cfa    x29, 0
> +//	.cfi_offset     x29, 0 * 8
> +//	.cfi_offset     x30, 1 * 8

I guess this will help is to demonstrate that nobody is relying on
detailed annotations here...

>  /*
>   * This mysterious NOP is required for some unwinders (e.g. libc++) that
> @@ -51,16 +62,19 @@
>  	nop	// Mysterious NOP
>  
>  /*
> - * GDB relies on being able to identify the sigreturn instruction sequence to
> - * unwind from signal handlers. We cannot, therefore, use SYM_FUNC_START()
> - * here, as it will emit a BTI C instruction and break the unwinder. Thankfully,
> - * this function is only ever called from a RET and so omitting the landing pad
> - * is perfectly fine.
> + * GDB, libgcc and libunwind rely on being able to identify the sigreturn
> + * instruction sequence to unwind from signal handlers. We cannot, therefore,
> + * use SYM_FUNC_START() here, as it will emit a BTI C instruction and break the
> + * unwinder. Thankfully, this function is only ever called from a RET and so
> + * omitting the landing pad is perfectly fine.
>   */
>  SYM_CODE_START(__kernel_rt_sigreturn)
> +//	PLEASE DO NOT MODIFY
>  	mov	x8, #__NR_rt_sigreturn
> +//	PLEASE DO NOT MODIFY
>  	svc	#0
> -	.cfi_endproc
> +//	PLEASE DO NOT MODIFY
> +//	.cfi_endproc
>  SYM_CODE_END(__kernel_rt_sigreturn)

FWIW,
Acked-by: Dave Martin <Dave.Martin at arm.com>




More information about the linux-arm-kernel mailing list