[PATCH v2] perf: arm_cspmu: Separate Arm and vendor module

Wed Apr 26 17:44:56 PDT 2023

Hi Suzuki,

> -----Original Message-----
> From: Besar Wicaksono <bwicaksono at nvidia.com>
> Sent: Wednesday, April 19, 2023 7:22 PM
> To: Suzuki K Poulose <suzuki.poulose at arm.com>; catalin.marinas at arm.com;
> will at kernel.org; mark.rutland at arm.com
> Cc: linux-arm-kernel at lists.infradead.org; linux-kernel at vger.kernel.org; linux-
> tegra at vger.kernel.org; Thierry Reding <treding at nvidia.com>; Jonathan
> Hunter <jonathanh at nvidia.com>; Vikram Sethi <vsethi at nvidia.com>; Richard
> Wiley <rwiley at nvidia.com>; Eric Funsten <efunsten at nvidia.com>
> Subject: RE: [PATCH v2] perf: arm_cspmu: Separate Arm and vendor module
> 
> Hi Suzuki,
> 
> > -----Original Message-----
> > From: Suzuki K Poulose <suzuki.poulose at arm.com>
> > Sent: Wednesday, April 19, 2023 4:32 PM
> > To: Besar Wicaksono <bwicaksono at nvidia.com>; catalin.marinas at arm.com;
> > will at kernel.org; mark.rutland at arm.com
> > Cc: linux-arm-kernel at lists.infradead.org; linux-kernel at vger.kernel.org; linux-
> > tegra at vger.kernel.org; Thierry Reding <treding at nvidia.com>; Jonathan
> > Hunter <jonathanh at nvidia.com>; Vikram Sethi <vsethi at nvidia.com>;
> Richard
> > Wiley <rwiley at nvidia.com>; Eric Funsten <efunsten at nvidia.com>
> > Subject: Re: [PATCH v2] perf: arm_cspmu: Separate Arm and vendor module
> >
> > External email: Use caution opening links or attachments
> >
> >
> > On 18/04/2023 20:33, Besar Wicaksono wrote:
> > > Hi Suzuki,
> > >
> > >> -----Original Message-----
> > >> From: Suzuki K Poulose <suzuki.poulose at arm.com>
> > >> Sent: Tuesday, April 18, 2023 6:07 AM
> > >> To: Besar Wicaksono <bwicaksono at nvidia.com>;
> > catalin.marinas at arm.com;
> > >> will at kernel.org; mark.rutland at arm.com
> > >> Cc: linux-arm-kernel at lists.infradead.org; linux-kernel at vger.kernel.org;
> > linux-
> > >> tegra at vger.kernel.org; Thierry Reding <treding at nvidia.com>; Jonathan
> > >> Hunter <jonathanh at nvidia.com>; Vikram Sethi <vsethi at nvidia.com>;
> > Richard
> > >> Wiley <rwiley at nvidia.com>; Eric Funsten <efunsten at nvidia.com>
> > >> Subject: Re: [PATCH v2] perf: arm_cspmu: Separate Arm and vendor
> > module
> > >>
> > >> External email: Use caution opening links or attachments
> > >>
> > >>
> > >> On 18/04/2023 07:20, Besar Wicaksono wrote:
> > >>> Arm Coresight PMU driver consists of main standard code and vendor
> > >>> backend code. Both are currently built as a single module.
> > >>> This patch adds vendor registration API to separate the two to
> > >>> keep things modular. Vendor module shall register to the main
> > >>> module on loading and trigger device reprobe.
> > >>>
> > >>> Signed-off-by: Besar Wicaksono <bwicaksono at nvidia.com>
> > >>> ---
> > >>>
> > >>> Changes from v1:
> > >>>    * Added separate Kconfig entry for nvidia backend
> > >>>    * Added lock to protect accesses to the lists
> > >>>    * Added support for matching subset devices from a vendor
> > >>>    * Added state tracking to avoid reprobe when a device is in use
> > >>> v1: ttps://lore.kernel.org/linux-arm-kernel/20230403163905.20354-1-
> > >> bwicaksono at nvidia.com/T/#u
> > >>>
> > >>> ---
> > >>>    drivers/perf/arm_cspmu/Kconfig        |   9 +-
> > >>>    drivers/perf/arm_cspmu/Makefile       |   6 +-
> > >>>    drivers/perf/arm_cspmu/arm_cspmu.c    | 280
> > >> +++++++++++++++++++++++---
> > >>>    drivers/perf/arm_cspmu/arm_cspmu.h    |  32 ++-
> > >>>    drivers/perf/arm_cspmu/nvidia_cspmu.c |  39 +++-
> > >>>    drivers/perf/arm_cspmu/nvidia_cspmu.h |  17 --
> > >>>    6 files changed, 325 insertions(+), 58 deletions(-)
> > >>>    delete mode 100644 drivers/perf/arm_cspmu/nvidia_cspmu.h
> > >>>
> > >>> diff --git a/drivers/perf/arm_cspmu/Kconfig
> > >> b/drivers/perf/arm_cspmu/Kconfig
> > >>> index 0b316fe69a45..8ce7b45a0075 100644
> > >>> --- a/drivers/perf/arm_cspmu/Kconfig
> > >>> +++ b/drivers/perf/arm_cspmu/Kconfig
> > >>> @@ -1,6 +1,6 @@
> > >>>    # SPDX-License-Identifier: GPL-2.0
> > >>>    #
> > >>> -# Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights
> > >> reserved.
> > >>> +# Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All
> > rights
> > >> reserved.
> > >>>
> > >>>    config ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU
> > >>>        tristate "ARM Coresight Architecture PMU"
> > >>> @@ -11,3 +11,10 @@ config
> > ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU
> > >>>          based on ARM CoreSight PMU architecture. Note that this PMU
> > >>>          architecture does not have relationship with the ARM CoreSight
> > >>>          Self-Hosted Tracing.
> > >>> +
> > >>> +config NVIDIA_CORESIGHT_PMU_ARCH_SYSTEM_PMU
> > >>> +     tristate "NVIDIA Coresight Architecture PMU"
> > >>> +     depends on ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU
> > >>> +     help
> > >>> +       Provides NVIDIA specific attributes for performance monitoring unit
> > >>> +       (PMU) devices based on ARM CoreSight PMU architecture.
> > >>> diff --git a/drivers/perf/arm_cspmu/Makefile
> > >> b/drivers/perf/arm_cspmu/Makefile
> > >>> index fedb17df982d..f8ae22411d59 100644
> > >>> --- a/drivers/perf/arm_cspmu/Makefile
> > >>> +++ b/drivers/perf/arm_cspmu/Makefile
> > >>> @@ -1,6 +1,6 @@
> > >>> -# Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights
> > >> reserved.
> > >>> +# Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All
> > rights
> > >> reserved.
> > >>>    #
> > >>>    # SPDX-License-Identifier: GPL-2.0
> > >>>
> > >>> -obj-$(CONFIG_ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU) +=
> > >> arm_cspmu_module.o
> > >>> -arm_cspmu_module-y := arm_cspmu.o nvidia_cspmu.o
> > >>> +obj-$(CONFIG_ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU) +=
> > >> arm_cspmu.o
> > >>> +obj-$(CONFIG_NVIDIA_CORESIGHT_PMU_ARCH_SYSTEM_PMU) +=
> > >> nvidia_cspmu.o
> > >>> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c
> > >> b/drivers/perf/arm_cspmu/arm_cspmu.c
> > >>> index e31302ab7e37..c55ea2b74454 100644
> > >>> --- a/drivers/perf/arm_cspmu/arm_cspmu.c
> > >>> +++ b/drivers/perf/arm_cspmu/arm_cspmu.c
> > >>> @@ -16,7 +16,7 @@
> > >>>     * The user should refer to the vendor technical documentation to get
> > >> details
> > >>>     * about the supported events.
> > >>>     *
> > >>> - * Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights
> > >> reserved.
> > >>> + * Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All
> > rights
> > >> reserved.
> > >>>     *
> > >>>     */
> > >>>
> > >>> @@ -25,13 +25,14 @@
> > >>>    #include <linux/ctype.h>
> > >>>    #include <linux/interrupt.h>
> > >>>    #include <linux/io-64-nonatomic-lo-hi.h>
> > >>> +#include <linux/list.h>
> > >>>    #include <linux/module.h>
> > >>> +#include <linux/mutex.h>
> > >>>    #include <linux/perf_event.h>
> > >>>    #include <linux/platform_device.h>
> > >>>    #include <acpi/processor.h>
> > >>>
> > >>>    #include "arm_cspmu.h"
> > >>> -#include "nvidia_cspmu.h"
> > >>>
> > >>>    #define PMUNAME "arm_cspmu"
> > >>>    #define DRVNAME "arm-cs-arch-pmu"
> > >>> @@ -117,11 +118,52 @@
> > >>>     */
> > >>>    #define HILOHI_MAX_POLL     1000
> > >>>
> > >>> -/* JEDEC-assigned JEP106 identification code */
> > >>> -#define ARM_CSPMU_IMPL_ID_NVIDIA             0x36B
> > >>> -
> > >>>    static unsigned long arm_cspmu_cpuhp_state;
> > >>>
> > >>> +/* List of Coresight PMU instances in the system. */
> > >>> +static LIST_HEAD(arm_cspmus);
> > >>> +
> > >>> +/* List of registered vendor backends. */
> > >>> +static LIST_HEAD(arm_cspmu_impls);
> > >>> +
> > >>> +static DEFINE_MUTEX(arm_cspmu_lock);
> > >>> +
> > >>> +/*
> > >>> + * State of the generic driver.
> > >>> + * 0 => registering backend.
> > >>> + * 1 => ready to use.
> > >>> + * 2 or more => in use.
> > >>> + */
> > >>> +#define ARM_CSPMU_STATE_REG  0
> > >>> +#define ARM_CSPMU_STATE_READY        1
> > >>> +static atomic_t arm_cspmu_state;
> > >>> +
> > >>> +static void arm_cspmu_state_ready(void)
> > >>> +{
> > >>> +     atomic_set(&arm_cspmu_state, ARM_CSPMU_STATE_READY);
> > >>> +}
> > >>> +
> > >>> +static bool try_arm_cspmu_state_reg(void)
> > >>> +{
> > >>> +     const int old = ARM_CSPMU_STATE_READY;
> > >>> +     const int new = ARM_CSPMU_STATE_REG;
> > >>> +
> > >>> +     return atomic_cmpxchg(&arm_cspmu_state, old, new) == old;
> > >>> +}
> > >>> +
> > >>> +static bool try_arm_cspmu_state_get(void)
> > >>> +{
> > >>> +     return atomic_inc_not_zero(&arm_cspmu_state);
> > >>> +}
> > >>> +
> > >>> +static void arm_cspmu_state_put(void)
> > >>> +{
> > >>> +     int ret;
> > >>> +
> > >>> +     ret = atomic_dec_if_positive(&arm_cspmu_state);
> > >>> +     WARN_ON(ret < 0);
> > >>> +}
> > >>> +
> > >>
> > >> As long as the vendor module is set for the PMU instance, it won't be
> > >> unloaded as long as there are any perf events and thus the specific
> > >> driver cannot be unloaded. So, you don't need explicit refcount
> > >> maintenance for each pmu callbacks.
> > >>
> > >
> > > The arm_cspmu_state mainly protects during new backend registration
> > when
> > > the device is attached to standard implementation. All devices are attached
> > to
> > > standard implementation initially when arm_cspmu module is loaded,
> since
> > there
> > > is no backend yet. On backend registration, the standard impl is replaced by
> > > backend impl. However, the module unloading mechanism doesn't provide
> > > protection because standard impl is owned by arm_cspmu module, which
> > > is not unloaded during registration.
> > >
> > > The refcount may not be required if the devices are not attached to
> standard
> > > Implementation by default, and the unreg doesn't fallback to it. But that
> > makes
> > > the devices usable only when there is a vendor backend available.
> >
> > Ok, thanks for the explanation. But I still think we :
> >
> >   - Don't need a single global refcount for all the PMUs. Instead this
> >     could be per PMU instance (arm_cspmu), only when it is backed by
> 
> Ok, we can add refcount per PMU.
> 
> >     "generic" backend, others get module refcount. If the refcount of
> >     "generic" PMU is positive, during the registration of a matching
> >     backend driver, we could simply mark that as pending reprobe.
> >
> >   - And only do the refcount for the following call backs:
> >
> >    pmu:: event_init -> hold the refcount
> >    pmu:: destroy -> drop the refcount and trigger a reprobe if one was
> >          pending (see above)
> 
> Is it safe to reprobe on destroy ? The reprobe will free and reallocate
> the memory managed by the device.

I checked the _free_event function from kernel/events/core.c. There are other
cleanups after the event->destroy is called, which may touch a stale data after a
reprobe occurs.
This would suggest reprobing right after event->destroy may not be safe, whether
it's a deferred reprobe or immediate reprobe from arm_cspmu_impl_register.

The alternative of not attaching the device to standard implementation by default
seems to be the better one so far.

Do you have other suggestion ?

> 
> Thanks,
> Besar