[REGRESSION] rseq: refactoring in v6.19 broke everyone on arm64 and tcmalloc everywhere

Thomas Gleixner tglx at linutronix.de
Fri Apr 24 07:16:08 PDT 2026


On Fri, Apr 24 2026 at 10:32, Mathias Stearn wrote:
> On Fri, Apr 24, 2026 at 9:57 AM Dmitry Vyukov <dvyukov at google.com> wrote:
>> The only problem is with membarrier (it used to force write to
>> __rseq_abi.cpu_id_start for all threads, but now it does not).
>> Otherwise the caching scheme works.
>
> I almost wrote a message last night saying that we didn't need
> cpu_id_start invalidation on preemption. However, I remembered that
> the Grow() function[1] does a load outside of a critical section then
> stores a derived value inside the critical section, guarded only by
> the cpu_id_start invalidation check in StoreCurrentCpu[2]. It really
> should be doing a compare against the original value inside the
> critical section (or just do the whole thing inside), but it doesn't.
> I haven't reasoned end-to-end through this fully to prove corruption
> is possible, but I suspect that it is if another thread on the same CPU
> preempts between the loads and the store and updates the header before
> the original thread resumes and writes its original intended header
> value. Ditto for signals, which sometimes allocate even though they
> shouldn't.
>
> I was really hoping that the "redundant" cpu_id_start writes would
> only be needed on membarrier_rseq IPIs, where it really is a
> pay-for-what-you-use functionality,

That's fine and can be solved without adding this sequence overhead into
the scheduler hotpath.

> I think existing binaries depend on invalidation on
> preemption. Luckily that should be cheap enough to be ~free.

That's only free when it can be buried in the rseq_cs update, which
means the ID update would not happen when rseq_cs is NULL.

If those two changes fix it w/o requiring additional tcmalloc changes,
I'm happy to hack that up tomorrow.

Thanks,

        tglx



More information about the linux-arm-kernel mailing list