[musl] Re: [PATCH v8 00/38] arm64/gcs: Provide support for GCS in userspace

Tue Feb 20 18:11:23 PST 2024

On Tue, 2024-02-20 at 20:27 -0500, dalias at libc.org wrote:
> > Then I think WRSS might fit your requirements better than what
> > glibc
> > did. It was considered a reduced security mode that made libc's job
> > much easier and had better compatibility, but the last discussion
> > was
> > to try to do it without WRSS.
> 
> Where can I read more about this? Some searches I tried didn't turn
> up
> much useful information.

There never was any proposal written down AFAIK. In the past we have
had a couple "shadow stack meetup" calls where folks who are working on
shadow stack got together to hash out some things. We discussed it
there.

But briefly, in the Intel SDM (and other places) there is documentation
on the special shadow stack instructions. The two key ones for this are
WRSS and RSTORSSP. WRSS is an instruction which can be enabled by the
kernel (and there is upstream support for this). The instruction can
write through shadow stack memory.

RSTORSSP can be used to consume a restore token, which is a special
value on the shadow stack. When this operations happens the SSP is
moved adjacent to the token that was just consumed. So between the two
of them the SSP can be adjusted to specific spots on the shadow stack
or another shadow stack.

Today when you longjmp() with shadow stack in glibc, INCSSP is used to
move the SSP back to the spot on the shadow stack where the setjmp()
was called. But this algorithm doesn't always work, for example,
longjmp()ing between stacks. To work around this glibc uses a scheme
where it searches from the target SSP for a shadow stack token and then
consumes it and INCSSPs back to the target SPP. It just barely
miraculously worked in most cases.

Some specific cases that were still open were longjmp()ing off of a
custom userspace threading library stack, which may not have left a
token behind when it jumped to a new stack. And also, potentially off
of an alt shadow stack in the future, depending on whether it leaves a
restore token when handling a signal. (the problem there, is if there
is no room to leave it).

So that is how x86 glic works, and I think arm was thinking along the
same lines. But if you have WRSS (and arm's version), you could just
write a restore token or anything else you need to fixup on the shadow
stack. Then you could longjmp() in one go without any high wire acts.
It's much simpler and more robust and would prevent needing to leave a
restore token when handling a signal to an alt shadow stack. Although,
nothing was ever prototyped. So "in theory".

But that is all about moving the SSP where you need it. It doesn't
resolve any of the allocation lifecycle issues. I think for those the
solutions are:
1. Not supporting ucontext/sigaltstack and shadow stack
2. Stefan's idea
3. A new interface that takes user allocated shadow stacks for those
operations

My preference has been a combination of 1 and 3. For threads, I think
Mark's clone3 enhancements will help.

Anyway, there is an attempt at a summary. I'd also point you to HJ for
more glibc context, as I mostly worked on the kernel side.