[PATCH] arm64: Apply dynamic shadow call stack patching in two passes

Thu Jan 26 11:35:41 PST 2023

On Thu, 26 Jan 2023 at 20:08, Linus Torvalds
<torvalds at linux-foundation.org> wrote:
>
> On Tue, Dec 13, 2022 at 6:29 AM Ard Biesheuvel <ardb at kernel.org> wrote:
> >
> > Due to past bad experiences with the highly complex and overengineered
> > DWARF standard that describes the unwind metadata that we are using to
> > locate these instructions [..]
>
> Just a note on why I distrust DWARF data so much - it's not so much
> because it's complex and overengineered (although I agree it is),
> because that's not anything new.
>
> It's because it's almost entirely *untested*.
>
> Compiler code generation bugs are a real issue, and happen
> semi-regularly. Are they _common_? No. But they are an issue, and we
> were chasing one just a couple of weeks ago.
>
> But code generation bugs are things that get very fundamentally
> tested. When the compiler generates bad code, every single user of
> that compiler will effectively be testing it.
>
> Yes, we still hit them, often because the kernel does something
> unusual (ie the last one was apparently only triggered with a
> combination of sanitizer and coverage flags), so "test coverage" isn't
> any kind of guarantee, but it's there.
>
> But DWARF debug info? It can be *completely* wrong, and in 99.9% of
> all cases nobody will ever notice in any testing. Most of the time
> it's not used at all, and even when it is used (whether exception
> handling or for actually doing debuggers) it's used only for a tiny
> tiny percentage of the whole thing.
>
> So it's not just that we've had bad experiences with it in the past. I
> feel that the problem goes deeper than that - the lack of testing
> means that it's fundamentally not trustworthy.
>
> Am I exaggerating a bit? Sure. Compilers have (extensive) test-suites
> for debug info too. But I do think that coverage tends to be much less
> than "everybody relies on it being right" like for normal code
> generation.
>
> End result: I would love for us to have some additional security nets
> in this area.
>
> Doing the checks as a dry-run phase is good so that any possible
> issues hopefully get caught before the code actively rewrites things,
> but I'd still be even happier if this was a build-time thing and part
> of objtool or something.
>
> That way the dwarf info would also be validated even when it's not
> actively used - which is a large point about my "this has seldom been
> tested" issue with it.
>
> Because I *think* this dry-run thing is only run on the (few) arm64
> cores that actually have PACIASP/AUTIASP. No?
>

No, the other way around. On cores where PACIASP/AUTIASP execute as
NOPs, they are replaced with shadow call stack pushes and pops. This
is preferred over having both shadow call stack and PAC in the same
[generic] kernel image, as the performance hit of the shadow call
stack is not justified on cores that implement PAC.
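
To illustrate the two-pass idea, here is a minimal userspace sketch. It is
not the actual patch: scs_patch(), scs_patch_pass() and the locs[] array are
made-up names, and locs[] merely stands in for the instruction locations
that the real code derives from the unwind metadata. Only the four
instruction encodings are the real AArch64 opcodes. The first pass is a dry
run that checks every location; the text is only rewritten if that pass
finds nothing unexpected.

```c
#include <stddef.h>
#include <stdint.h>

/* AArch64 instruction encodings */
#define PACIASP   0xd503233fu   /* hint #25 */
#define AUTIASP   0xd50323bfu   /* hint #29 */
#define SCS_PUSH  0xf800865eu   /* str x30, [x18], #8 */
#define SCS_POP   0xf85f8e5eu   /* ldr x30, [x18, #-8]! */

/*
 * Hypothetical helper: walk the candidate locations (indices into text[],
 * assumed to be strictly increasing) and check that each one holds a
 * PACIASP or AUTIASP. With dry_run set, nothing is modified.
 * Returns 0 on success, -1 if anything looks wrong.
 */
static int scs_patch_pass(uint32_t *text, size_t text_len,
                          const size_t *locs, size_t nlocs, int dry_run)
{
    for (size_t i = 0; i < nlocs; i++) {
        /* Reject out-of-range, unsorted or duplicate locations. */
        if (locs[i] >= text_len || (i > 0 && locs[i] <= locs[i - 1]))
            return -1;

        uint32_t *insn = &text[locs[i]];

        switch (*insn) {
        case PACIASP:
            if (!dry_run)
                *insn = SCS_PUSH;
            break;
        case AUTIASP:
            if (!dry_run)
                *insn = SCS_POP;
            break;
        default:
            /* The metadata pointed at something we did not expect. */
            return -1;
        }
    }
    return 0;
}

/* Pass 1 validates everything; pass 2 rewrites only if pass 1 succeeded. */
static int scs_patch(uint32_t *text, size_t text_len,
                     const size_t *locs, size_t nlocs)
{
    if (scs_patch_pass(text, text_len, locs, nlocs, 1))
        return -1;
    return scs_patch_pass(text, text_len, locs, nlocs, 0);
}
```

If any location does not hold the expected opcode, scs_patch() returns -1
and the text is left untouched, so bad metadata results in an unpatched
image rather than a half-patched one.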