[REGRESSION] rseq: refactoring in v6.19 broke everyone on arm64 and tcmalloc everywhere

Mark Rutland mark.rutland at arm.com
Wed Apr 22 11:11:42 PDT 2026


On Wed, Apr 22, 2026 at 07:49:30PM +0200, Thomas Gleixner wrote:
> On Wed, Apr 22 2026 at 14:09, Mark Rutland wrote:
> > On Wed, Apr 22, 2026 at 11:50:26AM +0200, Mathias Stearn wrote:
> >> TL;DR: As of 6.19, rseq no longer provides the documented atomicity
> >> guarantees on arm64 by failing to abort the critical section on same-core
> >> preemption/resumption. Additionally, it breaks tcmalloc specifically by
> >> failing to overwrite the cpu_id_start field at points where it was relied
> >> on for correctness.
> >
> > Thanks for the report, and the test case.
> >
> > As a holding reply, I'm looking into this now from the arm64 side.
> 
> I assume it's the partial conversion to the generic entry code which
> screws that up. 

It's slightly more than that, but in a sense, yes. ;)

The fix is conceptually simple, but I'll need to do some refactoring.

Conceptually we just need to use syscall_enter_from_user_mode() and
irqentry_enter_from_user_mode() appropriately.

In practice, I can't use those as-is without introducing the exception
masking problems I just fixed up for irqentry_enter_from_kernel_mode(),
so I'll need to do some similar refactoring first.

That and I *think* a couple of the current checks for CONFIG_GENERIC_ENTRY
should be checking CONFIG_GENERIC_IRQ_ENTRY, since all of the relevant
bits are in the generic irqentry code rather than the GENERIC_SYSCALL
code (and GENERIC_ENTRY is just GENERIC_IRQ_ENTRY + GENERIC_SYSCALL).
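
To illustrate the relationship between the three options, here is a rough
userspace sketch in plain C, not kernel code. The struct, the helper, and the
two example configurations are hypothetical names made up for illustration;
the "partial conversion" values reflect how arm64 is described in this thread,
not a checked .config.

```c
#include <stdbool.h>

/* Hypothetical userspace model of the Kconfig symbols discussed above. */
struct kconfig_model {
	bool generic_irq_entry;	/* stands in for CONFIG_GENERIC_IRQ_ENTRY */
	bool generic_syscall;	/* stands in for CONFIG_GENERIC_SYSCALL */
};

/* GENERIC_ENTRY is modelled as GENERIC_IRQ_ENTRY + GENERIC_SYSCALL. */
static bool model_generic_entry(const struct kconfig_model *c)
{
	return c->generic_irq_entry && c->generic_syscall;
}

/* Assumed: an arch that uses the generic irqentry code only. */
static const struct kconfig_model partial_conversion = {
	.generic_irq_entry = true,
	.generic_syscall = false,
};

/* Assumed: an arch that uses both the irqentry and syscall code. */
static const struct kconfig_model full_conversion = {
	.generic_irq_entry = true,
	.generic_syscall = true,
};
```

In this model a check keyed on GENERIC_ENTRY is false for the partially
converted case even though the irqentry bits are present, which is why the
choice of symbol matters.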

> The problem reproduces with rseq selftests nicely.

Ah; that's both good to know, and worrying that we've never had a report
from all the automated testing people are supposedly running. :/

> The patch below fixes it as it puts ARM64 back to the non-optimized code
> for now. Once ARM64 is fully converted it gets all the nice improvements.

Thanks; I'll give that a test tomorrow.

I haven't paged everything in yet, so just to check, is there anything
that would behave incorrectly if current->rseq.event.user_irq were set
for syscall entry? IIUC it means we'll effectively do the slow path, and
I was wondering if that might be acceptable as a one-line bodge for
stable.
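
A minimal userspace sketch of that question, assuming the exit-path decision
has the shape of the check in the quoted rseq_update_usr() hunk. Everything
here (struct rseq_event_model, the enum, both helpers) is a hypothetical
stand-in rather than the kernel's own code, and it only shows why setting
user_irq would force the slow path; it says nothing about whether doing so is
otherwise safe.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-task rseq event bits named above. */
struct rseq_event_model {
	bool has_rseq;
	bool user_irq;
};

enum rseq_exit_path { RSEQ_FAST_PATH, RSEQ_SLOW_PATH };

/*
 * Same shape as the check in the quoted hunk: with debug mode off and
 * user_irq clear, the critical section handling is skipped.
 */
static enum rseq_exit_path model_exit_path(const struct rseq_event_model *ev,
					   bool debug_enabled)
{
	if (!debug_enabled && !ev->user_irq)
		return RSEQ_FAST_PATH;
	return RSEQ_SLOW_PATH;
}

/* The one-line bodge being asked about: treat syscall entry like a user IRQ. */
static void model_syscall_entry_bodge(struct rseq_event_model *ev)
{
	ev->user_irq = true;
}
```

With user_irq clear and debug off the model takes the fast path; after the
bodge it always takes the slow path, which is the trade-off being asked about
for stable.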

As above, I'd like it if the actual fix could make this work for
GENERIC_IRQ_ENTRY rather than GENERIC_ENTRY, since that way we can make
this work as it was supposed to *before* moving to GENERIC_SYSCALL
(which has a whole lot more ABI impact to worry about).

I think that just needs a small amount of refactoring that arm64 will
need regardless.

Mark.

> 
> Thanks,
> 
>         tglx
> ---
> diff --git a/include/linux/rseq.h b/include/linux/rseq.h
> index 2266f4dc77b6..d55476e2a336 100644
> --- a/include/linux/rseq.h
> +++ b/include/linux/rseq.h
> @@ -30,7 +30,7 @@ void __rseq_signal_deliver(int sig, struct pt_regs *regs);
>   */
>  static inline void rseq_signal_deliver(struct ksignal *ksig, struct pt_regs *regs)
>  {
> -	if (IS_ENABLED(CONFIG_GENERIC_IRQ_ENTRY)) {
> +	if (IS_ENABLED(CONFIG_GENERIC_ENTRY)) {
>  		/* '&' is intentional to spare one conditional branch */
>  		if (current->rseq.event.has_rseq & current->rseq.event.user_irq)
>  			__rseq_signal_deliver(ksig->sig, regs);
> @@ -50,7 +50,7 @@ static __always_inline void rseq_sched_switch_event(struct task_struct *t)
>  {
>  	struct rseq_event *ev = &t->rseq.event;
>  
> -	if (IS_ENABLED(CONFIG_GENERIC_IRQ_ENTRY)) {
> +	if (IS_ENABLED(CONFIG_GENERIC_ENTRY)) {
>  		/*
>  		 * Avoid a boat load of conditionals by using simple logic
>  		 * to determine whether NOTIFY_RESUME needs to be raised.
> diff --git a/include/linux/rseq_entry.h b/include/linux/rseq_entry.h
> index a36b472627de..8ccd464a108d 100644
> --- a/include/linux/rseq_entry.h
> +++ b/include/linux/rseq_entry.h
> @@ -80,7 +80,7 @@ bool rseq_debug_validate_ids(struct task_struct *t);
>  
>  static __always_inline void rseq_note_user_irq_entry(void)
>  {
> -	if (IS_ENABLED(CONFIG_GENERIC_IRQ_ENTRY))
> +	if (IS_ENABLED(CONFIG_GENERIC_ENTRY))
>  		current->rseq.event.user_irq = true;
>  }
>  
> @@ -171,8 +171,8 @@ bool rseq_debug_update_user_cs(struct task_struct *t, struct pt_regs *regs,
>  		if (unlikely(usig != t->rseq.sig))
>  			goto die;
>  
> -		/* rseq_event.user_irq is only valid if CONFIG_GENERIC_IRQ_ENTRY=y */
> -		if (IS_ENABLED(CONFIG_GENERIC_IRQ_ENTRY)) {
> +		/* rseq_event.user_irq is only valid if CONFIG_GENERIC_ENTRY=y */
> +		if (IS_ENABLED(CONFIG_GENERIC_ENTRY)) {
>  			/* If not in interrupt from user context, let it die */
>  			if (unlikely(!t->rseq.event.user_irq))
>  				goto die;
> @@ -387,7 +387,7 @@ static rseq_inline bool rseq_update_usr(struct task_struct *t, struct pt_regs *r
>  	 * allows to skip the critical section when the entry was not from
>  	 * a user space interrupt, unless debug mode is enabled.
>  	 */
> -	if (IS_ENABLED(CONFIG_GENERIC_IRQ_ENTRY)) {
> +	if (IS_ENABLED(CONFIG_GENERIC_ENTRY)) {
>  		if (!static_branch_unlikely(&rseq_debug_enabled)) {
>  			if (likely(!t->rseq.event.user_irq))
>  				return true;


