[PATCH] arm64: Implement prctl(PR_{G,S}ET_TSC)

Mon Apr 29 01:35:56 PDT 2024

First, a note that this was previously discussed on this list, with
the same motivation and the same questions asked, so let me link that
thread first:

Link: https://lore.kernel.org/linux-arm-kernel/CAP045ApiMSvP--f2E0=VdMbjE8oibvy921m8JASf4kaCCuU2RA@mail.gmail.com/T/

> It seems to me that this sort of "trap and inspect" behaviour is in
> the realm of ptrace(), and not that of prctl(), because I can't
> imagine the debugged program calling that by itself.

In the rr use case, this is indeed used to force a ptrace signal stop
for trap/inspect (or emulate in replay mode). The history of
PR_SET_TSC is complicated, though, and it was not originally intended
for this use case. Rather, it was intended as a hardening mechanism
against cache-timing/speculative execution attacks (this was
pre-Spectre, so the discussion at the time was a bit theoretical).
The Spectre experience has since shown that you don't actually need
architectural timers for any of this, so perhaps it wasn't all that
useful in the first place.
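
For reference, a minimal userspace sketch of the existing prctl()
interface follows. The PR_* constants are the ones already defined in
include/uapi/linux/prctl.h; the helper names (tsc_mode_name,
get_tsc_mode, trap_counter_reads) are made up for illustration. On
arm64 the calls are only expected to succeed on a kernel carrying this
patch; elsewhere they would presumably fail with EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/prctl.h>

/* Fallbacks for old libc headers; values from include/uapi/linux/prctl.h. */
#ifndef PR_GET_TSC
#define PR_GET_TSC	25
#define PR_SET_TSC	26
#define PR_TSC_ENABLE	1
#define PR_TSC_SIGSEGV	2
#endif

/* Hypothetical helper: human-readable name for a PR_TSC_* mode. */
const char *tsc_mode_name(int mode)
{
	switch (mode) {
	case PR_TSC_ENABLE:
		return "enabled";
	case PR_TSC_SIGSEGV:
		return "sigsegv";
	default:
		return "unknown";
	}
}

/*
 * Hypothetical helper: query the current mode of the calling task.
 * Returns 0 and fills *mode_out on success, or -errno on failure
 * (for example -EINVAL where the prctl is not implemented).
 */
int get_tsc_mode(int *mode_out)
{
	assert(mode_out != NULL);
	if (prctl(PR_GET_TSC, (unsigned long)mode_out, 0UL, 0UL, 0UL) != 0)
		return -errno;
	return 0;
}

/*
 * Hypothetical helper: request SIGSEGV on counter reads by the calling
 * task. A tracer such as rr would then see a signal stop for each read.
 */
int trap_counter_reads(void)
{
	if (prctl(PR_SET_TSC, (unsigned long)PR_TSC_SIGSEGV, 0UL, 0UL, 0UL) != 0)
		return -errno;
	return 0;
}
```

A tracee would call trap_counter_reads() once early on (rr arranges
this on the tracee's behalf); after that, each counter read raises
SIGSEGV, which the tracer observes as an ordinary signal stop.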

I think this could reasonably be made a PTRACE_EVENT, but that would
not be usable for hardening if somebody wanted to do that in the future
(I am not personally aware of any such request, but it doesn't seem
unreasonable). My concerns about adding yet another kind of ptrace
stop, mentioned last time, also still stand.

Perhaps a third option would be to put this into seccomp instead. One
could imagine a flag addition to SECCOMP_SET_MODE_FILTER that
would enable filtering for all otherwise emulated instructions (not just
a CNTVCT_EL0 read) and an appropriate bpf program to decide
allow/disallow/trap/signal/whatever. Of course, there are folks already
unhappy with using seccomp for non-security purposes (as rr already
does), so there may be some objection there.

Similar concerns about making ptrace stops more complicated apply
here as well (the basic issue is that what you can do to a tracee at
a ptrace stop very much depends on where exactly in the kernel the
tracee is stopped, which ptrace doesn't always tell you, so to be
fully correct, ptracers need to do some very elaborate state machine
tracking).
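
Whichever mechanism delivers the stop, the "inspect" step amounts to
recognising the instruction at the faulting PC. A minimal sketch of
that check for the CNTVCT_EL0 read is below, assuming the tracer has
already fetched the 32-bit instruction word. The function name
is_cntvct_read is hypothetical, and only this one register is matched;
other counter registers the kernel may emulate are deliberately not
handled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A64 encoding of "mrs x0, cntvct_el0"; the low five bits hold Rt. */
#define MRS_CNTVCT_EL0	UINT32_C(0xd53be040)
#define MRS_RT_MASK	UINT32_C(0x0000001f)

/*
 * Hypothetical helper: return true if insn is "mrs xN, cntvct_el0".
 * On a match, *rt_out (if non-NULL) receives the destination register
 * number N. N == 31 encodes xzr, i.e. the result is discarded.
 */
bool is_cntvct_read(uint32_t insn, unsigned int *rt_out)
{
	if ((insn & ~MRS_RT_MASK) != MRS_CNTVCT_EL0)
		return false;
	if (rt_out != NULL)
		*rt_out = insn & MRS_RT_MASK;
	return true;
}
```

In replay mode, a tracer could use the decoded register number to
write the recorded counter value into the tracee's register set and
advance the PC by four bytes before resuming.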

> How does rr work on non-x86 architectures? How does this interoperate
> with AArch32?

rr has significant microarchitecture dependence. It works on most modern
x86 chips (AMD worse than Intel, with the exception of some Intel Atom
uarchs) and on more recent high-end AArch64 chips, as long as the
programs only use LSE atomics and not LL/SC. AArch32 is not supported.
See here for a list:

https://github.com/rr-debugger/rr/blob/822863c92d8f52778700f0375ac1b705193a2152/src/PerfCounters.cc#L198-L237

Thanks,
Keno