[PATCHv3 1/4] arm64: entry: judge nmi ealier to avoid deadlock in RCU

Mark Rutland mark.rutland at arm.com
Fri Nov 19 06:04:04 PST 2021


On Fri, Nov 19, 2021 at 10:01:00AM +0800, Pingfan Liu wrote:
> On Wed, Nov 17, 2021 at 11:38:54AM +0000, Mark Rutland wrote:
> > On Tue, Nov 16, 2021 at 04:24:47PM +0800, Pingfan Liu wrote:
> > > Linux kernel places strict semantics between NMI and maskable interrupt.
> > > So does the RCU component, else deadlock may happen. But the current
> > > arm64 entry code can partially breach this rule through calling
> > > rcu_nmi_enter().
> > > 
> > > ***  how a deadlock can happen if NMI mistaken as IRQ  ***
> > > 
> > >     rcu_nmi_enter()
> > >     {
> > > 
> > >         if (rcu_dynticks_curr_cpu_in_eqs()) {
> > > 
> > >         	if (!in_nmi())
> > >         		rcu_dynticks_task_exit();
> > >                 ...
> > >         	if (!in_nmi()) {
> > >         		instrumentation_begin();
> > >         		rcu_cleanup_after_idle();
> > >         		instrumentation_end();
> > >         	}
> > >                 ...
> > >         } else if (!in_nmi()) {
> > >         	instrumentation_begin();
> > >         	rcu_irq_enter_check_tick();
> > >         }
> > > 
> > >     }
> > > 
> > > If a NMI is mistaken as a maskable interrupt, rcu_irq_enter_check_tick()
> > > can hit a deadlock, which is demonstrated by the following scenario:
> > > 
> > >     note_gp_changes() runs in a task context
> > >     {
> > >        local_irq_save(flags); // this protects against irq, but not NMI
> > 
> > Note: speecifically, this masks IRQs via ICC_PMR_EL1.
> > 
> > >        rnp = rdp->mynode;
> > >        ...
> > >        raw_spin_trylock_rcu_node(rnp)
> > >                                   -------> broken in by (p)NMI, without taking __nmi_enter()
> > 
> > AFAICT the broken case described here *cannot* happen.
> > 
> > If we take a pNMI here, we'll go:
> > 
> >   el1h_64_irq()
> >   -> el1h64_irq_handler()
> >   -> el1_interrupt()
> > 
> > ... where el1_interrupt is:
> > 
> > | static void noinstr el1_interrupt(struct pt_regs *regs,
> > |                                   void (*handler)(struct pt_regs *)) 
> > | {
> > |         write_sysreg(DAIF_PROCCTX_NOIRQ, daif);
> > | 
> > |         if (IS_ENABLED(CONFIG_ARM64_PSEUDO_NMI) && !interrupts_enabled(regs))
> > |                 __el1_pnmi(regs, handler);
> > |         else
> > |                 __el1_irq(regs, handler);
> > | } 
> > 
> > ... and interrupts_enabled() is:
> > 
> > | #define interrupts_enabled(regs)                        \
> > |        (!((regs)->pstate & PSR_I_BIT) && irqs_priority_unmasked(regs))
> > 
> > ... and irqs_priority_unmasked is:
> > 
> > | #define irqs_priority_unmasked(regs)                                    \
> > |         (system_uses_irq_prio_masking() ?                               \
> > |                 (regs)->pmr_save == GIC_PRIO_IRQON :                    \
> > |                 true)
> > 
> > ... irqs_priority_unmasked(regs) *must* return false for this case,
> > since the local_irq_save() above must have set the PMR to
> > GIC_PRIO_IRQOFF in the interrupted context.
> > 
> > Thus, we *must* call __el1_pnmi() in this case, where __el1_pnmi() is:
> > 
> 
> Yes, you are right. And I made a mistake to think it will call
> __el1_irq()

> As above, I made a mistake about the condition hence the code path.
> There is no problem in this case through calling __el1_pnmi()
> 
> Sorry for the false alarm and thank you again for your detailed
> explaination.

No worries; thanks for confirming this was a false alarm. Glad we're now
on the same page. :)

Likewise, thanks for reporting what you thought was an issue; having
more eyes on this is always good!

Thanks,
Mark.



More information about the linux-arm-kernel mailing list