[PATCH 05/10] entry: Split preemption from irqentry_exit_to_kernel_mode()
Mark Rutland
mark.rutland at arm.com
Wed Apr 8 03:19:25 PDT 2026
On Wed, Apr 08, 2026 at 05:17:29PM +0800, Jinjie Ruan wrote:
>
>
> On 2026/4/7 21:16, Mark Rutland wrote:
> > Some architecture-specific work needs to be performed between the state
> > management for exception entry/exit and the "real" work to handle the
> > exception. For example, arm64 needs to manipulate a number of exception
> > masking bits, with different exceptions requiring different masking.
> >
> > Generally this can all be hidden in the architecture code, but for arm64
> > the current structure of irqentry_exit_to_kernel_mode() makes this
> > particularly difficult to handle in a way that is correct, maintainable,
> > and efficient.
> >
> > The gory details are described in the thread surrounding:
> >
> > https://lore.kernel.org/lkml/acPAzdtjK5w-rNqC@J2N7QTR9R3/
> >
> > The summary is:
> >
> > * Currently, irqentry_exit_to_kernel_mode() handles both involuntary
> > preemption AND state management necessary for exception return.
> >
> > * When scheduling (including involuntary preemption), arm64 needs to
> > have all arm64-specific exceptions unmasked, though regular interrupts
> > must be masked.
> >
> > * Prior to the state management for exception return, arm64 needs to
> > mask a number of arm64-specific exceptions, and perform some work with
> > these exceptions masked (with RCU watching, etc).
> >
> > While in theory it is possible to handle this with a new arch_*() hook
> > called somewhere under irqentry_exit_to_kernel_mode(), this is fragile
> > and complicated, and doesn't match the flow used for exception return to
> > user mode, which has a separate 'prepare' step (where preemption can
> > occur) prior to the state management.
> >
> > To solve this, refactor irqentry_exit_to_kernel_mode() to match the
> > style of {irqentry,syscall}_exit_to_user_mode(), moving preemption logic
> > into a new irqentry_exit_to_kernel_mode_preempt() function, and moving
> > state management in a new irqentry_exit_to_kernel_mode_after_preempt()
> > function. The existing irqentry_exit_to_kernel_mode() is left as a
> > caller of both of these, avoiding the need to modify existing callers.
> >
> > There should be no functional change as a result of this patch.
> >
> > Signed-off-by: Mark Rutland <mark.rutland at arm.com>
> > Cc: Andy Lutomirski <luto at kernel.org>
> > Cc: Catalin Marinas <catalin.marinas at arm.com>
> > Cc: Jinjie Ruan <ruanjinjie at huawei.com>
> > Cc: Peter Zijlstra <peterz at infradead.org>
> > Cc: Thomas Gleixner <tglx at kernel.org>
> > Cc: Vladimir Murzin <vladimir.murzin at arm.com>
> > Cc: Will Deacon <will at kernel.org>
> > ---
> > include/linux/irq-entry-common.h | 26 +++++++++++++++++++++-----
> > 1 file changed, 21 insertions(+), 5 deletions(-)
> >
> > Thomas/Peter/Andy, as mentioned on IRC, I haven't created kerneldoc
> > comments for these new functions because the existing comments don't
> > seem all that consistent (e.g. for user mode vs kernel mode), and I
> > suspect we want to rewrite them all in one go for wider consistency.
> >
> > I'm happy to respin this, or to follow-up with that as per your
> > preference.
> >
> > Mark.
> >
> > diff --git a/include/linux/irq-entry-common.h b/include/linux/irq-entry-common.h
> > index 2206150e526d8..24830baa539c6 100644
> > --- a/include/linux/irq-entry-common.h
> > +++ b/include/linux/irq-entry-common.h
> > @@ -421,10 +421,18 @@ static __always_inline irqentry_state_t irqentry_enter_from_kernel_mode(struct p
> > return ret;
> > }
> >
> > -static __always_inline void irqentry_exit_to_kernel_mode(struct pt_regs *regs, irqentry_state_t state)
> > +static inline void irqentry_exit_to_kernel_mode_preempt(struct pt_regs *regs, irqentry_state_t state)
> > {
> > - lockdep_assert_irqs_disabled();
> > + if (regs_irqs_disabled(regs) || state.exit_rcu)
> > + return;
> > +
> > + if (IS_ENABLED(CONFIG_PREEMPTION))
> > + irqentry_exit_cond_resched();
> > +}
> >
> > +static __always_inline void
> > +irqentry_exit_to_kernel_mode_after_preempt(struct pt_regs *regs, irqentry_state_t state)
> > +{
> > if (!regs_irqs_disabled(regs)) {
> > /*
> > * If RCU was not watching on entry this needs to be done
> > @@ -443,9 +451,6 @@ static __always_inline void irqentry_exit_to_kernel_mode(struct pt_regs *regs, i
> > }
> >
> > instrumentation_begin();
> > - if (IS_ENABLED(CONFIG_PREEMPTION))
> > - irqentry_exit_cond_resched();
> > -
> > /* Covers both tracing and lockdep */
> > trace_hardirqs_on();
> > instrumentation_end();
> > @@ -459,6 +464,17 @@ static __always_inline void irqentry_exit_to_kernel_mode(struct pt_regs *regs, i
> > }
> > }
> >
> > +static __always_inline void irqentry_exit_to_kernel_mode(struct pt_regs *regs, irqentry_state_t state)
> > +{
> > + lockdep_assert_irqs_disabled();
> > +
> > + instrumentation_begin();
> > + irqentry_exit_to_kernel_mode_preempt(regs, state);
> > + instrumentation_end();
>
> I think the below AI's feedback makes sense. Directly calling
> irqentry_exit_to_kernel_mode_preempt() on arm64/other archs could lead
> to missing instrumentation_begin()/end() markers.
>
> https://sashiko.dev/#/patchset/20260407131650.3813777-1-mark.rutland%40arm.com
I deliberartely made irqentry_exit_to_kernel_mode_preempt 'inline'
rather than '__always_inline' since everything it does is
instrumentable, and it's up to architecture code to handle that
appropriately.
On arm64 instrumentation_begin() and instrumentation_end() are currently
irrelevant. I didn't add those in the arm64-specific entry code as
they'd simply add pointless NOPs.
This is fine as-is.
Mark.
More information about the linux-arm-kernel
mailing list