[REGRESSION] rseq: refactoring in v6.19 broke everyone on arm64 and tcmalloc everywhere

Mathieu Desnoyers mathieu.desnoyers at efficios.com
Mon Apr 27 11:35:22 PDT 2026


On 2026-04-27 03:40, Florian Weimer wrote:
> * Thomas Gleixner:
> 
>> The real question is how to differentiate between the legacy and the
>> optimized mode. I have two working variants to achieve that:
[...]
> 
> Switching to the new extensible RSEQ allocation code in older glibc
> builds is not entirely trivial, and I would prefer not doing that.
> Registering with a new flag is comparatively simple, and we could
> backport it, except that it might not be compatible with CRIU.

A third option would let the entire range of older libc versions
benefit from rseq optimizations by gating the "v2" behavior on:

   rseq_len > 32 || (flags & RSEQ_FLAG_V2)

As a result:

- compatibility with existing tcmalloc binaries would be restored.

- glibc 2.41+ would benefit from optimization without changes.

- glibc 2.35-2.40 would be able to easily backport minimal changes [*]
   to benefit from kernel optimizations (flags & RSEQ_FLAG_V2).
   Likewise for RHEL glibc 2.34 with backported rseq support.

[*] Minimal changes to allow older libc to use the optimized mode
     involve querying a new getauxval(AT_RSEQ_V2) entry, which would
     return nonzero when the kernel supports the v2 flag, and, when
     it is supported, passing a new RSEQ_FLAG_V2 flag to rseq on
     registration.
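
For illustration only, the libc side of that footnote could look
roughly like the sketch below. AT_RSEQ_V2 and RSEQ_FLAG_V2 are the names
proposed in this email and do not exist in any released header; the
numeric values here are placeholders chosen for the example, and
rseq_flags_for() / rseq_registration_flags() are hypothetical helper
names. The only real API used is getauxval() from <sys/auxv.h>.

```c
#include <assert.h>     /* static_assert (C11) */
#include <stdint.h>
#include <sys/auxv.h>   /* getauxval() */

/* HYPOTHETICAL placeholders: proposed names, made-up values. */
#ifndef AT_RSEQ_V2
#define AT_RSEQ_V2 0x7ffffff0UL
#endif
#ifndef RSEQ_FLAG_V2
#define RSEQ_FLAG_V2 (1U << 31)
#endif

/* The existing RSEQ_FLAG_UNREGISTER is (1 << 0); a new flag must differ. */
static_assert(RSEQ_FLAG_V2 != 1U, "RSEQ_FLAG_V2 collides with RSEQ_FLAG_UNREGISTER");

/* Map the (hypothetical) auxv answer to the flags passed at registration. */
uint32_t rseq_flags_for(unsigned long at_rseq_v2)
{
    return at_rseq_v2 != 0 ? RSEQ_FLAG_V2 : 0;
}

/* Ask the kernel via the auxiliary vector; getauxval() returns 0 for
 * entries the kernel did not provide, so older kernels yield flags == 0. */
uint32_t rseq_registration_flags(void)
{
    return rseq_flags_for(getauxval(AT_RSEQ_V2));
}
```

The result would be passed as the flags argument of the rseq
registration call, leaving the existing 32-byte registration path
otherwise untouched.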

That v2 behavior would:

A) Enforce the ABI contract:

    - RO fields corruption -> kill process,

    - System call within rseq critical section -> kill process.

B) Allow optimization of the rseq field updates (only update relevant
    fields on migration).

This entirely decouples the feature enablement concern (rseq_len) from
the strictness/optimization mode (v2).

This keeps compatibility with current tcmalloc binaries because
tcmalloc always registers a 32-byte rseq_len without the v2
flag set. tcmalloc already has its own internal fields at fixed
offsets from the rseq structure which conflict with extended rseq
fields, so limiting the tcmalloc work-around behavior to
rseq_len == 32 seems to align well with the tcmalloc project's
approach to extensibility and ecosystem inter-compatibility.
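
As a rough sketch of the gating condition and the cases discussed
above (not actual kernel code): rseq_wants_v2() and RSEQ_LEGACY_LEN are
hypothetical names, the RSEQ_FLAG_V2 value is a placeholder, and the
64-byte length merely stands in for "any registration larger than 32
bytes".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RSEQ_LEGACY_LEN 32u           /* original rseq_len discussed above */
#ifndef RSEQ_FLAG_V2
#define RSEQ_FLAG_V2 (1U << 31)       /* HYPOTHETICAL placeholder value */
#endif

/* Proposed gate: rseq_len > 32 || (flags & RSEQ_FLAG_V2) */
bool rseq_wants_v2(uint32_t rseq_len, uint32_t flags)
{
    return rseq_len > RSEQ_LEGACY_LEN || (flags & RSEQ_FLAG_V2) != 0;
}

/* The three cases from the discussion, expressed as checks. */
void rseq_mode_examples(void)
{
    /* tcmalloc: 32 bytes, no v2 flag -> legacy behavior */
    assert(!rseq_wants_v2(32, 0));
    /* older libc with the backported flag -> v2 */
    assert(rseq_wants_v2(32, RSEQ_FLAG_V2));
    /* extended registration (illustrative 64 bytes) -> v2 without changes */
    assert(rseq_wants_v2(64, 0));
}
```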

Thoughts ?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com



More information about the linux-arm-kernel mailing list