[PATCH 0/6] Fix unwinding through sigreturn trampolines

Tue Jun 23 11:34:37 EDT 2020

Hi Will,

The CFI is correct for PowerPC[1] and X86[2] at least; I did not check others.
so LLVM’s unwinder can unwind/backtrace sigreturn without any specific hack on those architectures.
The a4eb355a3fda change implements the same CFI as PowerPC and X86 have so the generic unwind logic is able to process the sigreturn.
IMHO the kernel change was fine. 

I’m not comfortable to say the instruction sequence is the ABI that describe the unwinding information.
I guess the pattern matching was implemented due to the CFI was wrong.

The .cfi_signal_frame could be useful once the entry is found, but needs the right .CFI annotations at the right place.

This change definitely a not good for LLVM’s unwinder.

Cheers,
Daniel

[1] PowerPC:
The label (.Lsigrt_start) that is used for the CFI info include the nop: 
https://elixir.bootlin.com/linux/latest/source/arch/powerpc/kernel/vdso64/sigtramp.S#L25
https://elixir.bootlin.com/linux/latest/source/arch/powerpc/kernel/vdso64/sigtramp.S#L291

[2] X86:
https://elixir.bootlin.com/linux/latest/source/arch/x86/entry/vdso/vdso32/sigreturn.S#L13
https://elixir.bootlin.com/linux/latest/source/arch/x86/entry/vdso/vdso32/sigreturn.S#L59

> On 23 Jun 2020, at 13:08, Ard Biesheuvel <ardb at kernel.org> wrote:
> 
> On Tue, 23 Jun 2020 at 12:58, Will Deacon <will at kernel.org> wrote:
>> 
>> On Tue, Jun 23, 2020 at 12:11:24PM +0200, Ard Biesheuvel wrote:
>>> On Tue, 23 Jun 2020 at 11:54, Dave Martin <Dave.Martin at arm.com> wrote:
>>>> In the longer term, we should perhaps keep .cfi_signal_frame, and ditch
>>>> all the other annotations.  If the unwinder knows for sure it's the
>>>> signal frame (and understands this annotation), it can use its own
>>>> knowledge to do the right thing.  Otherwise, it can fall back on its
>>>> other heuristics.
>>> 
>>> What we found out is that libunwind unconditionally uses its own
>>> unwind strategy when it spots the mov/svc instruction pair at the
>>> return address, and so what we put here does not matter at all.
>>> The libgcc case is different, as it only applies the heuristics if it
>>> doesn't find any unwind info covering the return address. So this is
>>> what bit us here: by fixing the way the CFI directives are emitted,
>>> libgcc no longer uses its own baked in unwind strategy for the signal
>>> frame, but interprets the one we provide, which results in a bogus
>>> address for the /next/ frame. Since that is not covered by unwind info
>>> either, libgcc attempts to apply the heuristics at this level, and
>>> explodes because the bogus address cannot be dereferenced to get at
>>> the opcodes.
>>> 
>>> So libgcc doesn't even check whether it looks like a signal frame
>>> unless the unwind info is missing.
>>> 
>>>> This is probably better than having rich annotations that are
>>>> pedantically correct, but only half-supported by unwinders out in the
>>>> wild.
>>>> 
>>> 
>>> Agreed. But without providing any annotations, other unwinders will
>>> fail to unwind through the sigreturn trampoline, which is why the
>>> change was made in the first place.
>>> 
>>> So the safest way to fix this (imho) is to add CFI directives that
>>> mirror what libunwind does unconditionally, and libgcc when it spots
>>> the mov/svc pair on unannotated addresses, which is to restore all the
>>> registers from the sigframe, and to use regs->pc as the return address
>>> (which is where the unwind should proceed, rather than at whatever
>>> value happened to reside in x30 at the point the signal was taken)
>>> 
>>> The only hairy bit is the FP/SIMD registers, which are ignored by
>>> libunwind, and only half-restored by libgcc (which appears to assume
>>> that only d8..d15 need to be restored). This means throwing a C++
>>> exception in a signal handler and catching it outside of it is going
>>> to be difficult to get 100% correct, but this is something that
>>> appears to be broken already everywhere. (Unless there are some
>>> AAPCS/ABI rules that specify that FP/SIMD registers have to be assumed
>>> clobbered when catching an exception)
>> 
>> I think Dave makes an interesting point about .cfi_signal_frame, though.
>> With any released arm64 kernel, unwinding here requires heuristics to
>> identify the signal frame. The usual approach seems to pattern match the
>> instruction stream (and I'm not convinced libunwind works on big-endian
>> here), but you could also have looked up pc+4 to see if the function was
>> marked with .cfi_signal_frame. You can't do that anymore after this patch.
> 
> Good point.
> 
>> We have no evidence anybody bothered with that (it seems like more trouble
>> than it's worth), but it's an interesting observation. Furthermore, maybe
>> the right answer *is* to punt this all to the unwinder given that we're
>> already forced to keep the instruction sequence exactly as it is. I suspect
>> the LLVM unwinder has to use heuristics for most other architectures
>> anyway, so it should already have whatever hooks it needs.
>> 
> 
> The unwinder is definitely better equipped to deal with this, given
> how difficult it is to come up with CFI directives at compile time
> that apply universally to any kind of sigframe we may instantiate at
> runtime.
> 
>> It would be a different story if most unwinders were expecting to use
>> CFI and it generally worked well on other architectures, but that really
>> doesn't appear to be the case.
>> 
> 
> If fixing LLVM unwinding is no longer a requirement, reverting to what
> we had is probably the best, although it is rather unfortunate that we
> will be encouraging the LLVM developers to put another bodge in user
> space. Could we at least encourage them to take the libgcc approach,
> where the builtin strategy is only used as a fallback if the return
> address is not covered by unwind info? Otherwise, we are painting
> ourselves into a corner even more than we have already with respect to
> our ability to ever start doing this properly in the future.