[PATCHv2 06/11] arm64: entry: move el1 irq/nmi logic to C
Mark Rutland
mark.rutland at arm.com
Thu May 6 02:16:37 PDT 2021
On Thu, May 06, 2021 at 04:28:09PM +0800, He Ying wrote:
> Hi Mark,
Hi,
> I have faced a performance regression for handling IPIs since this commit.
>
> I calculated the cycles from the entry of el1_irq to the entry of
> gic_handle_irq.
>
> From my test, this commit may add an average of 200 cycles of overhead. Do you
> have any ideas about this? Looking forward to your reply.
On that path, the only meaningful difference is the call to
enter_el1_irq_or_nmi(), since that's now unconditional, and it's an
extra layer in the callchain.
When either CONFIG_ARM64_PSEUDO_NMI or CONFIG_TRACE_IRQFLAGS is
selected, enter_el1_irq_or_nmi() is a wrapper for functions we'd already
call, and I'd expect the cost of the callees to dominate.
When neither CONFIG_ARM64_PSEUDO_NMI nor CONFIG_TRACE_IRQFLAGS is
selected, this should add a trivial function that immediately returns,
and so 200 cycles seems excessive.
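To illustrate why the function can collapse to nothing, here is a rough userspace model. It is not the kernel source: every sketch_* name and both SKETCH_* macros are hypothetical stand-ins, and the branch structure is only an assumption about how a config-gated wrapper of this kind is typically written.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for CONFIG_ARM64_PSEUDO_NMI and
 * CONFIG_TRACE_IRQFLAGS; override with -D to model other configs. */
#ifndef SKETCH_PSEUDO_NMI
#define SKETCH_PSEUDO_NMI 0
#endif
#ifndef SKETCH_TRACE_IRQFLAGS
#define SKETCH_TRACE_IRQFLAGS 0
#endif

/* Counters so the work done under each config can be observed. */
static int sketch_nmi_entries;
static int sketch_trace_calls;

/* Hypothetical callee standing in for the NMI entry work. */
static void sketch_enter_nmi(void)
{
	/* Only reachable when pseudo-NMI is modelled as enabled. */
	assert(SKETCH_PSEUDO_NMI);
	sketch_nmi_entries++;
}

/* Hypothetical callee standing in for IRQ-flag tracing;
 * does nothing when tracing is modelled as disabled. */
static void sketch_trace_irqs_off(void)
{
	if (SKETCH_TRACE_IRQFLAGS)
		sketch_trace_calls++;
}

/* Model of the wrapper: with both macros 0, the conditions are
 * compile-time constants and every branch is dead code. */
static void sketch_enter_el1_irq_or_nmi(bool irqs_were_enabled)
{
	if (SKETCH_PSEUDO_NMI && !irqs_were_enabled)
		sketch_enter_nmi();
	else
		sketch_trace_irqs_off();
}

/* Simulates two entries (IRQs enabled, then masked) and returns
 * the total work recorded by the callees. */
static int sketch_count_work(void)
{
	sketch_nmi_entries = 0;
	sketch_trace_calls = 0;
	sketch_enter_el1_irq_or_nmi(true);
	sketch_enter_el1_irq_or_nmi(false);
	return sketch_nmi_entries + sketch_trace_calls;
}
```

With both macros left at 0, sketch_count_work() records no work at all, so an optimising compiler is free to reduce the wrapper to a bare return. Building with -DSKETCH_PSEUDO_NMI=1 or -DSKETCH_TRACE_IRQFLAGS=1 makes the callees do real work, which is the case where their cost would be expected to dominate.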
Building that commit with defconfig, I see that GCC 10.1.0 generates:
| ffff800010dfc864 <enter_el1_irq_or_nmi>:
| ffff800010dfc864: d503233f paciasp
| ffff800010dfc868: d50323bf autiasp
| ffff800010dfc86c: d65f03c0 ret
... so perhaps the PACIASP and AUTIASP have an impact?
I have a few questions:
* Which CPU do you see this on?
* Does that CPU implement pointer authentication?
* What kernel config are you using? e.g. is this seen with defconfig?
* What's the total cycle count from el1_irq to gic_handle_irq?
* Does this measurably impact a real workload?
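For the pointer authentication question, one way to check from userspace on arm64 Linux is to look for the address-authentication hwcap. The sketch below is only illustrative: sketch_has_pointer_auth() is a hypothetical helper name, and the HWCAP_PACA fallback value is taken from the arm64 hwcap ABI in case the libc headers do not provide it. Note that the hwcap reflects what the kernel exposes to userspace, so it also depends on the kernel having been built with pointer authentication support.

```c
#if defined(__linux__)
#include <sys/auxv.h>
#endif

/* Bit value from the arm64 Linux hwcap ABI, defined here only if
 * the libc headers do not already provide it. */
#ifndef HWCAP_PACA
#define HWCAP_PACA (1UL << 30)
#endif

/* Hypothetical helper: returns 1 if address authentication is
 * advertised to userspace, 0 if it is not, and -1 when not built
 * for arm64 Linux (where the check is not meaningful). */
static int sketch_has_pointer_auth(void)
{
#if defined(__linux__) && defined(__aarch64__)
	unsigned long hwcap = getauxval(AT_HWCAP);

	return (hwcap & HWCAP_PACA) ? 1 : 0;
#else
	return -1;
#endif
}
```

The same information appears as the "paca" entry in the Features line of /proc/cpuinfo on arm64.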
Thanks,
Mark.