[PATCHv2 06/11] arm64: entry: move el1 irq/nmi logic to C

Mark Rutland mark.rutland at arm.com
Thu May 6 02:16:37 PDT 2021


On Thu, May 06, 2021 at 04:28:09PM +0800, He Ying wrote:
> Hi Mark,

Hi,

> I have faced a performance regression for handling IPIs since this commit.
> 
> I calculated the cycles from the entry of el1_irq to the entry of
> gic_handle_irq.
> 
> From my test, this commit may add an average overhead of 200 cycles.
> Do you have any ideas about this? Looking forward to your reply.

On that path, the only meaningful difference is the call to
enter_el1_irq_or_nmi(), since that's now unconditional, and it's an
extra layer in the callchain.

When either CONFIG_ARM64_PSEUDO_NMI or CONFIG_TRACE_IRQFLAGS is
selected, enter_el1_irq_or_nmi() is a wrapper for functions we'd already
call, and I'd expect the cost of the callees to dominate.

When neither CONFIG_ARM64_PSEUDO_NMI nor CONFIG_TRACE_IRQFLAGS is
selected, this should add a trivial function that immediately returns,
and so 200 cycles seems excessive.
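
To make the two cases above concrete, here is a loose userspace model of the
dispatch, not the kernel source. The *_stub and *_model names are hypothetical
stand-ins, and the config macros are plain 0/1 defines assumed for
illustration:

```c
#include <stdbool.h>

/* Assumed 0/1 stand-ins for the Kconfig options; both off models defconfig. */
#ifndef CONFIG_ARM64_PSEUDO_NMI
#define CONFIG_ARM64_PSEUDO_NMI 0
#endif
#ifndef CONFIG_TRACE_IRQFLAGS
#define CONFIG_TRACE_IRQFLAGS 0
#endif

/* Counters so the effect of each callee is observable in this model. */
int nmi_entries;
int trace_calls;

/* Hypothetical stand-in for the NMI entry work. */
static void arm64_enter_nmi_stub(void)
{
	nmi_entries++;
}

/* Hypothetical stand-in for irqflags tracing; empty when tracing is off. */
static void trace_hardirqs_off_stub(void)
{
#if CONFIG_TRACE_IRQFLAGS
	trace_calls++;
#endif
}

/*
 * Model of the wrapper: with both options off, the condition is a
 * compile-time false and the remaining callee is empty, so a compiler
 * can reduce the body to an immediate return.
 */
void enter_el1_irq_or_nmi_model(bool irqs_were_enabled)
{
	if (CONFIG_ARM64_PSEUDO_NMI && !irqs_were_enabled)
		arm64_enter_nmi_stub();
	else
		trace_hardirqs_off_stub();
}
```

With both macros left at 0, calling enter_el1_irq_or_nmi_model() changes
neither counter, which is the "trivial function" case. Building with either
macro set to 1 makes the model do work in the corresponding callee.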

Building that commit with defconfig, I see that GCC 10.1.0 generates:

| ffff800010dfc864 <enter_el1_irq_or_nmi>:
| ffff800010dfc864:       d503233f        paciasp
| ffff800010dfc868:       d50323bf        autiasp
| ffff800010dfc86c:       d65f03c0        ret

... so perhaps the PACIASP and AUTIASP have an impact?
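
For context on those two instructions: PACIASP and AUTIASP are allocated in
the A64 HINT space, so as far as I understand the architecture they execute as
NOPs on CPUs without pointer authentication. A minimal sketch that decodes the
words from the listing above (a64_hint_imm() and a64_hint_name() are
hypothetical helper names, not from any library):

```c
#include <stdint.h>

/*
 * Return the HINT immediate (0..127) if insn lies in the A64 HINT
 * space, or -1 otherwise. HINT #imm is encoded as 0xd503201f with
 * the 7-bit immediate in bits [11:5].
 */
int a64_hint_imm(uint32_t insn)
{
	if ((insn & 0xfffff01fu) != 0xd503201fu)
		return -1;
	return (int)((insn >> 5) & 0x7f);
}

/* Name the few hints relevant here; anything else is left generic. */
const char *a64_hint_name(uint32_t insn)
{
	switch (a64_hint_imm(insn)) {
	case 0:
		return "nop";
	case 25:
		return "paciasp";
	case 29:
		return "autiasp";
	case -1:
		return "not a hint";
	default:
		return "other hint";
	}
}
```

Feeding it 0xd503233f and 0xd50323bf yields hint immediates 25 and 29, while
0xd65f03c0 (the ret) is rejected as outside the HINT space. This says nothing
about how many cycles a given CPU spends on them, only that they are hints.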

I have a few questions:

* Which CPU do you see this on?

* Does that CPU implement pointer authentication?

* What kernel config are you using? e.g. is this seen with defconfig?

* What's the total cycle count from el1_irq to gic_handle_irq?

* Does this measurably impact a real workload?
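
One possible way to answer the pointer authentication question from
userspace, sketched under the assumption of an arm64 Linux system: check the
address-authentication bit in AT_HWCAP. The bit values mirror the arm64 uapi
hwcap definitions as I understand them; the function names are hypothetical.
Note that this reports what the kernel exposes to userspace, which also
depends on the kernel config, not only on the CPU.

```c
#include <stdbool.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>
#endif

/* arm64 Linux hwcap bits (assumed to match HWCAP_PACA / HWCAP_PACG). */
#define ARM64_HWCAP_PACA (1UL << 30)
#define ARM64_HWCAP_PACG (1UL << 31)

/* Pure helper: does this hwcap word advertise address authentication? */
bool hwcap_has_address_auth(unsigned long hwcap)
{
	return (hwcap & ARM64_HWCAP_PACA) != 0;
}

/*
 * Returns 1 if the running system advertises address authentication,
 * 0 if it does not, and -1 when not built for arm64 Linux.
 */
int cpu_reports_pointer_auth(void)
{
#if defined(__linux__) && defined(__aarch64__)
	return hwcap_has_address_auth(getauxval(AT_HWCAP)) ? 1 : 0;
#else
	return -1;
#endif
}
```

The same information should also be visible as "paca" in the Features line
of /proc/cpuinfo on kernels that expose it.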

Thanks,
Mark.


