[REGRESSION] rseq: refactoring in v6.19 broke everyone on arm64 and tcmalloc everywhere

Thomas Gleixner tglx at linutronix.de
Thu Apr 23 04:48:12 PDT 2026


On Thu, Apr 23 2026 at 11:24, Mathias Stearn wrote:
> On Wed, Apr 22, 2026 at 3:13 PM Peter Zijlstra <peterz at infradead.org> wrote:
> To make this more concrete, I am proposing adding
>
> unsafe_put_user((u32)task_cpu(t), &t->rseq.usrptr->cpu_id_start, efault);
>
> after each place where you currently do
>
> unsafe_put_user(0ULL, &t->rseq.usrptr->rseq_cs, efault);
>
> in rseq_update_user_cs. Is that something that you would expect to cause a
> performance issue?

That would work and not bring the performance issues back, but:

  1) Did you validate that adding the reset into rseq_update_user_cs() is
     actually sufficient?

     If adding it to rseq_update_user_cs() is not sufficient, then we
     have a really serious problem, because we'd need to go back to
     doing it unconditionally. That would instantly bring back the 15%
     performance regression which happened when glibc enabled rseq. In
     that case the damage to tcmalloc is the lesser of two evils.

  2) The tcmalloc abuse breaks the documented and guaranteed user space
     ABI and therefore it makes it impossible for any other library in
     an application which uses tcmalloc to rely on the documented and
     guaranteed rseq::cpu_id_start/rseq::cpu_id semantics.

     Which means that tcmalloc is holding everybody else hostage.
     That's just not acceptable. Not even under the no regression rule.

  3) The fact that tcmalloc prevents a user from enabling rseq debugging
     is equally unacceptable as it does not allow me to validate my own
     rseq magic code in my mongodb client because enabling it will make
     the DB I want to test against go away.

     Again tcmalloc holds everybody else hostage for no reason at all.
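
To make the proposal quoted above concrete, here is a minimal user-space
mock of it. This is a sketch under stated assumptions, not kernel code:
struct mock_rseq, mock_clear_cs() and mock_stash_was_reset() are
hypothetical names, the struct is not the real UAPI layout, and plain
stores stand in for unsafe_put_user(). It only shows the intended effect,
namely that clearing rseq_cs also rewrites cpu_id_start with the current
CPU. It says nothing about whether doing so in rseq_update_user_cs() alone
is sufficient, which is the open question in 1).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the user-visible rseq area. Only the fields
 * discussed in this thread are modeled; this is NOT the real UAPI layout. */
struct mock_rseq {
	uint32_t cpu_id_start;
	uint32_t cpu_id;
	uint64_t rseq_cs;
};

/* Mock of the proposed behaviour: every time the critical section
 * pointer is cleared, cpu_id_start is rewritten with the current CPU.
 * In the kernel these would be unsafe_put_user() stores to user memory. */
static void mock_clear_cs(struct mock_rseq *r, uint32_t cur_cpu)
{
	assert(r != NULL);
	r->rseq_cs = 0;
	r->cpu_id_start = cur_cpu;
}

/* Assumption for illustration: user space has stored an arbitrary value
 * (stash) in cpu_id_start. Returns 1 if, after the clear path ran,
 * rseq_cs is zero and cpu_id_start holds cur_cpu again; 0 otherwise. */
static int mock_stash_was_reset(uint32_t stash, uint32_t cur_cpu)
{
	struct mock_rseq r = {
		.cpu_id_start = stash,
		.cpu_id = cur_cpu,
		.rseq_cs = 0x1000,	/* pretend a critical section is active */
	};

	mock_clear_cs(&r, cur_cpu);
	return r.rseq_cs == 0 && r.cpu_id_start == cur_cpu;
}
```

Calling mock_stash_was_reset() with any stash value and CPU number
exercises the clear path once and checks both fields afterwards.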

The most amazing part is that tcmalloc uses this to save two
instruction cycles, while nobody noticed in 8 years how much performance
the unconditional rseq nonsense in the kernel left on the table.

Thanks,

        tglx


