[PATCH 0/6] Fix unwinding through sigreturn trampolines

Szabolcs Nagy szabolcs.nagy at arm.com
Tue Jun 23 09:56:14 EDT 2020

The 06/23/2020 14:20, Will Deacon wrote:
> Hi Szabolcs,
> Cheers for the reply.
> On Tue, Jun 23, 2020 at 12:11:09PM +0100, Szabolcs Nagy wrote:
> > as for thread cancellation in glibc: it uses exception
> > mechanism for cleanups, but the default cancel state
> > is PTHREAD_CANCEL_DEFERRED which means only blocking
> > libc calls throw (so -fexceptions is enough and the
> > libgcc logic is fine), if you switch to
> > PTHREAD_CANCEL_ASYNCHRONOUS then there may be a problem
> > but you can only do pure computations in that state,
> > (only 3 libc functions are defined to be async cancel
> > safe), i think you cannot register cleanup handlers
> > that run on the same stack frame that may be async
> > interrupted.
> Ah, I was trying to print a message, so I suppose that's out. Even so,
> debugging with gdb and putting a breakpoint on the callback showed that
> it wasn't getting invoked.
> My code is below just as an FYI, since being able to derive a test from
> this would be useful should we try to fix the CFI directives in future.
> I get different results based on different combinations of
> architecture, toolchain and optimisation level.

with -fexceptions gcc only emits the cleanup begin/end
labels around function calls, i.e. it only expects a throw
from functions (the cleanup handler is called if the pc is
between the begin/end labels during unwind). if an
instruction is interrupted and you throw from there then
cleanup may work if the instruction happens to be in the
range covered by the begin/end labels, but gcc does not
try to make that happen.
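
to illustrate the range check described above, here is a
minimal sketch in c. it is a simplified model, not libgcc's
actual code or table encoding: the struct call_site layout,
the find_landing_pad name and the addresses in example_table
are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of one cleanup range. */
struct call_site {
    uintptr_t start;        /* address of the "begin" label */
    uintptr_t length;       /* "end" label minus "begin" label */
    uintptr_t landing_pad;  /* where the cleanup code lives */
};

/* Made-up layout: only the 4-byte call at 0x1010 is covered,
 * as would be typical with -fexceptions. */
const struct call_site example_table[] = {
    { 0x1010, 4, 0x1030 },
};

/* Return the landing pad covering pc, or 0 if no range covers it. */
uintptr_t find_landing_pad(const struct call_site *table, size_t n,
                           uintptr_t pc)
{
    assert(table != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        /* Unsigned subtraction wraps when pc < start, so this one
         * comparison checks start <= pc < start + length. */
        if (pc - table[i].start < table[i].length)
            return table[i].landing_pad;
    }
    return 0;
}
```

in this model a throw from pc 0x1010 finds the cleanup, while
an interrupted instruction at 0x1008 finds nothing, so the
cleanup is silently skipped.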

with -fnon-call-exceptions i think the test is supposed
to work and here it works, i get:

Cleanup handler called 0x2
Cleanup handler called 0x1

i think posix does not allow pthread_cleanup_push while
the cancel type is asynchronous (but you can change the
cancel type before and after it, which is valid i think).
i think printf is valid in the cleanup handler:
the cancel type is reset (and cancellation is disabled)
when libc acts on cancellation. (and if the interrupted
code was async cancel safe it should work).
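
for comparison, a minimal sketch of the uncontroversial case:
cleanup handlers registered and triggered with the default
deferred cancel type, where the throw comes from a blocking
libc call (pause). this is not Will's test; the names worker,
cleanup, call_order and run_deferred_cancel_demo are made up
for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_CALLS 2

static uintptr_t call_order[MAX_CALLS]; /* handler args, in call order */
static int calls;                       /* number of handlers run */

static void cleanup(void *arg)
{
    assert(calls < MAX_CALLS);
    call_order[calls++] = (uintptr_t)arg;
    printf("Cleanup handler called %#lx\n", (unsigned long)(uintptr_t)arg);
}

static void *worker(void *unused)
{
    (void)unused;
    /* Handlers are pushed while the cancel type is still deferred. */
    pthread_cleanup_push(cleanup, (void *)1);
    pthread_cleanup_push(cleanup, (void *)2);
    for (;;)
        pause();                /* cancellation point */
    /* Not reached, but push/pop must pair up lexically. */
    pthread_cleanup_pop(0);
    pthread_cleanup_pop(0);
    return NULL;
}

/* Returns 0 if the worker was cancelled and joined successfully. */
int run_deferred_cancel_demo(void)
{
    pthread_t t;
    void *res;

    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return -1;
    if (pthread_cancel(t) != 0)
        return -1;
    if (pthread_join(t, &res) != 0)
        return -1;
    return res == PTHREAD_CANCELED ? 0 : -1;
}
```

the handlers run in reverse order of registration, so the one
pushed with 0x2 runs before the one pushed with 0x1. build
with -pthread where the toolchain needs it.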

(that said i've seen issues with -fnon-call-exceptions
so i consider the musl cancellation design more robust:
just add the cleanup ptr to a libc internal list whose
handlers are called on cancellation, no unwinding is involved.
this does not work with c++ dtors though, but c++ never
defined dtor vs posix cancellation semantics so
cancelling c++ code is just undefined.)
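
a rough sketch of that list-based design, assuming a
per-thread singly linked list with nodes living on the
caller's stack. this is only loosely modelled on the idea,
it is not musl's source: cleanup_node, cleanup_push,
cleanup_pop, run_cleanups_on_cancel and record_handler are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct cleanup_node {
    void (*fn)(void *);
    void *arg;
    struct cleanup_node *next;
};

/* Per-thread head of the cleanup list (most recent first). */
static _Thread_local struct cleanup_node *cleanup_head;

/* Register fn(arg); node storage is provided by the caller. */
void cleanup_push(struct cleanup_node *node, void (*fn)(void *), void *arg)
{
    assert(node != NULL && fn != NULL);
    node->fn = fn;
    node->arg = arg;
    node->next = cleanup_head;
    cleanup_head = node;
}

/* Unregister the most recent handler, optionally running it. */
void cleanup_pop(int execute)
{
    struct cleanup_node *node = cleanup_head;

    if (node == NULL)
        return;
    cleanup_head = node->next;  /* unlink first, then call */
    if (execute)
        node->fn(node->arg);
}

/* What the library would do when acting on cancellation:
 * walk the list, newest first. No unwinder is involved. */
void run_cleanups_on_cancel(void)
{
    while (cleanup_head != NULL)
        cleanup_pop(1);
}

/* Sample handler that records its argument, for demonstration. */
static void *log_args[4];
static size_t log_len;

void record_handler(void *arg)
{
    if (log_len < sizeof log_args / sizeof log_args[0])
        log_args[log_len++] = arg;
}
```

since nothing here depends on unwind tables, it does not
matter which instruction was interrupted, but nothing runs
c++ destructors either.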

i think unwinding from an arbitrary instruction should
work on aarch64 when c/c++ code is interrupted, but
exception handlers don't need to work there.
