[PATCH 0/3] arm_pmu: acpi: avoid allocations in atomic context

Mark Rutland mark.rutland at arm.com
Fri Sep 30 04:18:41 PDT 2022


This series attempts to make the arm_pmu ACPI probing code play nicely
with PREEMPT_RT by moving work out of atomic context.

The arm_pmu ACPI probing code tries to do a number of things in atomic
context, which is generally not good and is especially problematic for
PREEMPT_RT, as reported by Valentin and Pierre:

  https://lore.kernel.org/all/20210810134127.1394269-2-valentin.schneider@arm.com/
  https://lore.kernel.org/linux-arm-kernel/20220912155105.1443303-1-pierre.gondois@arm.com/

We'd previously tried to bodge around this, e.g. in commits:

* 0dc1a1851af1d593 ("arm_pmu: add armpmu_alloc_atomic()")
* 167e61438da0664c ("arm_pmu: acpi: request IRQs up-front")

... but this isn't good enough for PREEMPT_RT, and as reported by Pierre
the probing code can trigger splats:

| BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
| in_atomic(): 0, irqs_disabled(): 128, non_block: 0, pid: 24, name: cpuhp/0
| preempt_count: 0, expected: 0
| RCU nest depth: 0, expected: 0
| 3 locks held by cpuhp/0/24:
| #0: ffffd8a22c8870d0 (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun (linux/kernel/cpu.c:754)
| #1: ffffd8a22c887120 (cpuhp_state-up){+.+.}-{0:0}, at: cpuhp_thread_fun (linux/kernel/cpu.c:754)
| #2: ffff083e7f0d97b8 ((&c->lock)){+.+.}-{3:3}, at: ___slab_alloc (linux/mm/slub.c:2954)
| irq event stamp: 42
| hardirqs last enabled at (41): finish_task_switch (linux/./arch/arm64/include/asm/irqflags.h:35)
| hardirqs last disabled at (42): cpuhp_thread_fun (linux/kernel/cpu.c:776 (discriminator 1))
| softirqs last enabled at (0): copy_process (linux/./include/linux/lockdep.h:191)
| softirqs last disabled at (0): 0x0
| CPU: 0 PID: 24 Comm: cpuhp/0 Tainted: G        W         5.19.0-rc3-rt4-custom-piegon01-rt_0 #142
| Hardware name: WIWYNN Mt.Jade Server System B81.03001.0005/Mt.Jade Motherboard, BIOS 1.08.20220218 (SCP: 1.08.20220218) 2022/02/18
| Call trace:
| dump_backtrace (linux/arch/arm64/kernel/stacktrace.c:200)
| show_stack (linux/arch/arm64/kernel/stacktrace.c:207)
| dump_stack_lvl (linux/lib/dump_stack.c:107)
| dump_stack (linux/lib/dump_stack.c:114)
| __might_resched (linux/kernel/sched/core.c:9929)
| rt_spin_lock (linux/kernel/locking/rtmutex.c:1732 (discriminator 4))
| ___slab_alloc (linux/mm/slub.c:2954)
| __slab_alloc.isra.0 (linux/mm/slub.c:3116)
| kmem_cache_alloc_trace (linux/mm/slub.c:3207)
| __armpmu_alloc (linux/./include/linux/slab.h:600)
| armpmu_alloc_atomic (linux/drivers/perf/arm_pmu.c:927)
| arm_pmu_acpi_cpu_starting (linux/drivers/perf/arm_pmu_acpi.c:204)
| cpuhp_invoke_callback (linux/kernel/cpu.c:192)
| cpuhp_thread_fun (linux/kernel/cpu.c:777 (discriminator 3))
| smpboot_thread_fn (linux/kernel/smpboot.c:164 (discriminator 3))
| kthread (linux/kernel/kthread.c:376)
| ret_from_fork (linux/arch/arm64/kernel/entry.S:868)

Thomas Gleixner suggested that we could pre-allocate structures to avoid
this issue:

  https://lore.kernel.org/all/87y299oyyq.ffs@tglx/

... and Pierre implemented that:

  https://lore.kernel.org/linux-arm-kernel/20220912155105.1443303-1-pierre.gondois@arm.com/

... but in practice this gets pretty hairy due to having to manage the
lifetime of those pre-allocated objects across various stages of the
probing flow.

This series reworks the code to perform all the allocation and
registration with perf at boot time, by scanning the set of online CPUs
and registering a PMU for each unique MIDR (which we use today to
identify distinct PMUs). This avoids the need for allocation in the
hotplug paths, and brings the ACPI probing code into line with the
DT/platform probing code.

When a CPU is late hotplugged, either:

(a) It matches an existing PMU's MIDR, and will be associated with that
    PMU.

(b) It does not match an existing PMU's MIDR, and will not be
    associated with a PMU (and a warning is logged to dmesg).

    Aside from the warning, this matches the existing behaviour, as we
    only register CPU PMUs with perf at boot time, and not for late
    hotplugged CPUs.
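
As a rough illustration of the two cases above, here is a minimal
userspace sketch of the matching logic. It is not the code from the
patches: struct sketch_pmu, probe_boot_cpus(), cpu_starting() and the
fixed-size tables are hypothetical stand-ins, and the MIDR values are
supplied by the caller rather than read from hardware. The point it
tries to show is that the boot-time pass is the only place that creates
PMUs, while the hotplug path only performs a lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_CPUS 16
#define MAX_PMUS 16

/* Hypothetical stand-in for a registered PMU; not the kernel's struct arm_pmu. */
struct sketch_pmu {
	uint32_t midr;        /* MIDR shared by every CPU this PMU covers */
	bool cpus[MAX_CPUS];  /* CPUs currently associated with this PMU */
};

static struct sketch_pmu pmus[MAX_PMUS];
static size_t nr_pmus;

/* Return the PMU registered for this MIDR, or NULL if there is none. */
static struct sketch_pmu *find_pmu(uint32_t midr)
{
	for (size_t i = 0; i < nr_pmus; i++)
		if (pmus[i].midr == midr)
			return &pmus[i];
	return NULL;
}

/* Boot time (non-atomic): create one PMU per unique MIDR among online CPUs. */
static void probe_boot_cpus(const uint32_t *midrs, const bool *online,
			    size_t nr_cpus)
{
	assert(nr_cpus <= MAX_CPUS);

	for (size_t cpu = 0; cpu < nr_cpus; cpu++) {
		struct sketch_pmu *pmu;

		if (!online[cpu])
			continue;

		pmu = find_pmu(midrs[cpu]);
		if (!pmu) {
			if (nr_pmus == MAX_PMUS)
				continue;	/* table full; leave CPU unassociated */
			pmu = &pmus[nr_pmus++];
			pmu->midr = midrs[cpu];
		}
		pmu->cpus[cpu] = true;
	}
}

/* Hotplug path: lookup only, never creates a PMU. */
static struct sketch_pmu *cpu_starting(size_t cpu, uint32_t midr)
{
	struct sketch_pmu *pmu;

	assert(cpu < MAX_CPUS);

	pmu = find_pmu(midr);
	if (!pmu) {
		/* Case (b): no match, so warn and carry on onlining the CPU. */
		fprintf(stderr, "Unable to associate CPU%zu with a PMU\n", cpu);
		return NULL;
	}

	/* Case (a): associate the CPU with the existing PMU. */
	pmu->cpus[cpu] = true;
	return pmu;
}
```

Calling probe_boot_cpus() once with the boot-time CPUs and then
cpu_starting() for each late-onlined CPU mirrors cases (a) and (b): a
CPU with a known MIDR joins the existing PMU, and a CPU with an unknown
MIDR gets a NULL return plus the warning, with the PMU count unchanged.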

I've tested the series in a VM, using ACPI and faked MIDR values to test
a few homogeneous and heterogeneous configurations, using the 'maxcpus'
kernel argument to test the late-hotplug behaviour:

* On a system where all CPUs have the same MIDR, late-onlining a CPU causes it
  to be associated with a matching PMU:

  | # ls /sys/bus/event_source/devices/
  | armv8_pmuv3_0  breakpoint     software       tracepoint
  | # cat /sys/bus/event_source/devices/armv8_pmuv3_0/cpus 
  | 0-7
  | # echo 1 > /sys/devices/system/cpu/cpu10/online 
  | Detected PIPT I-cache on CPU10
  | GICv3: CPU10: found redistributor a region 0:0x00000000081e0000
  | GICv3: CPU10: using allocated LPI pending table @0x00000000402b0000
  | CPU10: Booted secondary processor 0x000000000a [0x431f0af1]
  | # ls /sys/bus/event_source/devices/
  | armv8_pmuv3_0  breakpoint     software       tracepoint
  | # cat /sys/bus/event_source/devices/armv8_pmuv3_0/cpus 
  | 0-7,10

* On a system where all CPUs have a unique MIDR, each of the boot-time
  CPUs gets a unique PMU:

  | # ls /sys/bus/event_source/devices/
  | armv8_pmuv3_0  armv8_pmuv3_3  armv8_pmuv3_6  software
  | armv8_pmuv3_1  armv8_pmuv3_4  armv8_pmuv3_7  tracepoint
  | armv8_pmuv3_2  armv8_pmuv3_5  breakpoint

* On a system where all CPUs have a unique MIDR, late-onlining a CPU
  results in that CPU not being associated with a PMU, but the CPU is
  successfully onlined:

  | # echo 1 > /sys/devices/system/cpu/cpu8/online
  | Detected PIPT I-cache on CPU8
  | GICv3: CPU8: found redistributor 8 region 0:0x00000000081a0000
  | GICv3: CPU8: using allocated LPI pending table @0x0000000040290000
  | Unable to associate CPU8 with a PMU
  | CPU8: Booted secondary processor 0x0000000008 [0x431f0af1]

Thanks,
Mark.

Mark Rutland (3):
  arm_pmu: acpi: factor out PMU<->CPU association
  arm_pmu: factor out PMU matching
  arm_pmu: rework ACPI probing

 drivers/perf/arm_pmu.c       |  17 +-----
 drivers/perf/arm_pmu_acpi.c  | 113 ++++++++++++++++++++---------------
 include/linux/perf/arm_pmu.h |   1 -
 3 files changed, 69 insertions(+), 62 deletions(-)

-- 
2.30.2



