[PATCH 0/4] KVM: arm64: PMU: Fix PMUVer handling on heterogeneous PMU systems
Marc Zyngier
maz at kernel.org
Fri Jun 2 02:05:15 PDT 2023
On Fri, 02 Jun 2023 06:23:23 +0100,
Reiji Watanabe <reijiw at google.com> wrote:
>
> Hi Marc,
>
> On Thu, Jun 01, 2023 at 06:02:41AM +0100, Marc Zyngier wrote:
> > Hey Reiji,
> >
> > On Tue, 30 May 2023 13:53:24 +0100,
> > Reiji Watanabe <reijiw at google.com> wrote:
> > >
> > > Hi Marc,
> > >
> > > On Mon, May 29, 2023 at 02:39:28PM +0100, Marc Zyngier wrote:
> > > > On Sat, 27 May 2023 05:02:32 +0100,
> > > > Reiji Watanabe <reijiw at google.com> wrote:
> > > > >
> > > > > This series fixes issues with PMUVer handling for a guest with
> > > > > PMU configured on heterogeneous PMU systems.
> > > > > Specifically, it addresses the following two issues.
> > > > >
> > > > > [A] The default value of ID_AA64DFR0_EL1.PMUVer of the vCPU is set
> > > > > to its sanitized value. This could be inappropriate on
> > > > > heterogeneous PMU systems, as arm64_ftr_bits for PMUVer is defined
> > > > > as FTR_EXACT with safe_val == 0 (when ID_AA64DFR0_EL1.PMUVer of all
> > > > > PEs on the host is not uniform, the sanitized value will be 0).
> > > >
> > > > Why is this a problem? The CPUs don't implement the same version of
> > > > the architecture, we don't get a PMU. Why should we try to do anything
> > > > better? I really don't think we should go out or out way and make the
> > > > code more complicated for something that doesn't really exist.
> > >
> > > Even when the CPUs don't implement the same version of the architecture,
> > > if one of them implement PMUv3, KVM advertises KVM_CAP_ARM_PMU_V3,
> > > and allows userspace to configure PMU (KVM_ARM_VCPU_PMU_V3) for vCPUs.
> >
> > Ah, I see it now. The kernel will register the PMU even if it decides
> > that advertising it is wrong, and then we pick it up. Great :-/.
> >
> > > In this case, although KVM provides PMU emulations for the guest,
> > > the guest's ID_AA64DFR0_EL1.PMUVer will be zero. Also,
> > > KVM_SET_ONE_REG for ID_AA64DFR0_EL1 will never work for vCPUs
> > > with PMU configured on such systems (since KVM also doesn't allow
> > > userspace to set the PMUVer to 0 for the vCPUs with PMU configured).
> > >
> > > I would think either ID_AA64DFR0_EL1.PMUVer for the guest should
> > > indicate PMUv3, or KVM should not allow userspace to configure PMU,
> > > in this case.
> >
> > My vote is on the latter. Even if a PMU is available, we should rely
> > on the feature exposed by the kernel to decide whether exposing a PMU
> > or not.
> >
> > To be honest, this will affect almost nobody (I only know of a single
> > one, an obscure ARMv8.0+ARMv8.2 system which is very unlikely to ever
> > use KVM). I'm happy to take the responsibility to actively break those.
>
> Thank you for the information! Just curious, how about a mix of
> cores with and without PMU ? (with the same ARMv8.x version)
> I'm guessing there are very few if any though :)
I don't know of any. Similar things for IMPDEF PMUs. And to be honest,
I'd be very tempted to nuke that in KVM as well, because this is one
of the worse decision I ever made.
> > > This series is a fix for the former, mainly to keep the current
> > > behavior of KVM_CAP_ARM_PMU_V3 and KVM_ARM_VCPU_PMU_V3 on such
> > > systems, since I wasn't sure if such systems don't really exist :)
> > > (Also, I plan to implement a similar fix for PMCR_EL0.N on top of
> > > those changes)
> > >
> > > I could make a fix for the latter instead though. What do you think ?
> >
> > I think this would be valuable.
>
> Thank you for the comment! I will go with the latter.
Thanks.
> > Also, didn't you have patches for the EL0 side of the PMU? I've been
> > trying to look for a new version, but couldn't find it...
>
> While I'm working on fixing the series based on the recent comment from
> Oliver (https://lore.kernel.org/all/ZG%2Fw95pYjWnMJB62@linux.dev/),
> I have a new PMU EL0 issue, which blocked my testing of the series.
> So, I am debugging the new PMU EL0 issue.
>
> It appears that arch_perf_update_userpage() defined in
> drivers/perf/arm_pmuv3.c isn't used, and instead, the weak one in
> kernel/events/core.c is used.
Wut??? How comes? /me disassembles the kernel:
ffff8000082a1ab0 <arch_perf_update_userpage>:
ffff8000082a1ab0: d503201f nop
ffff8000082a1ab4: d503201f nop
ffff8000082a1ab8: d65f03c0 ret
ffff8000082a1abc: d503201f nop
ffff8000082a1ac0: d503201f nop
ffff8000082a1ac4: d503201f nop
What the hell is happening here???
> This prevents cap_user_rdpmc (, etc)
> from being set (This prevented my test program from directly
> accessing counters). This seems to be caused by the commit 7755cec63ade
> ("arm64: perf: Move PMUv3 driver to drivers/perf").
It is becoming more puzzling by the minute.
>
> I have not yet figured out why the one in arm_pmuv3.c isn't used
> though (The weak one in core.c seems to take precedence over strong
> ones under drivers/ somehow...).
>
> Anyway, I worked around the new issue for now, and ran the test for
> my series though. I will post the new version of the EL0 series
> tomorrow hopefully.
I have a "fix" for this. It doesn't make any sense, but it seems to
work here (GCC 10.2.1 from Debian). Can you please give it a shot?
Thanks,
M.
From 236ac26bd0e03bf2ca3b40471b61a35b02272662 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz at kernel.org>
Date: Fri, 2 Jun 2023 09:52:25 +0100
Subject: [PATCH] perf/core: Drop __weak attribute on arch-specific prototypes
Reiji reports that the arm64 implementation of arch_perf_update_userpage()
is now ignored and replaced by the dummy stub in core code.
This seems to happen since the PMUv3 driver was moved to driver/perf.
As it turns out, dropping the __weak attribute from the *prototype*
of the function solves the problem. You're right, this doesn't seem
to make much sense. And yet...
With this, arm64 is able to enjoy arch_perf_update_userpage() again.
And while we're at it, drop the same __weak attribute from the
arch_perf_get_page_size() prototype.
Reported-by: Reiji Watanabe <reijiw at google.com>
Signed-off-by: Marc Zyngier <maz at kernel.org>
---
include/linux/perf_event.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index d5628a7b5eaa..1509aea69a16 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1845,12 +1845,12 @@ int perf_event_exit_cpu(unsigned int cpu);
#define perf_event_exit_cpu NULL
#endif
-extern void __weak arch_perf_update_userpage(struct perf_event *event,
- struct perf_event_mmap_page *userpg,
- u64 now);
+extern void arch_perf_update_userpage(struct perf_event *event,
+ struct perf_event_mmap_page *userpg,
+ u64 now);
#ifdef CONFIG_MMU
-extern __weak u64 arch_perf_get_page_size(struct mm_struct *mm, unsigned long addr);
+extern u64 arch_perf_get_page_size(struct mm_struct *mm, unsigned long addr);
#endif
/*
--
2.39.2
--
Without deviation from the norm, progress is not possible.
More information about the linux-arm-kernel
mailing list