[RFC PATCH] KVM: arm64: vgic-v3: Cache ICC_CTLR_EL1 and allow lockless read when ready
Salil Mehta
salil.mehta at huawei.com
Mon Oct 13 08:48:42 PDT 2025
Hi Marc,
> From: Marc Zyngier <maz at kernel.org>
> Sent: Thursday, October 9, 2025 2:49 PM
> To: salil.mehta at opnsrc.net
[...]
>
> On Wed, 08 Oct 2025 21:19:55 +0100,
> salil.mehta at opnsrc.net wrote:
> >
> > From: Salil Mehta <salil.mehta at huawei.com>
> >
> > [A rough illustration of the problem and the probable solution]
> >
> > Userspace reads of ICC_CTLR_EL1 via KVM device attributes currently
> > take a slow path that may acquire all vCPU locks. Under workloads
> > that exercise userspace PSCI CPU_ON flows or frequent vCPU resets,
> > this can cause vCPU lock contention in KVM and, in the worst cases,
> > -EBUSY returns to userspace.
> >
> > When PSCI CPU_ON and CPU_OFF calls are handled entirely in KVM, these
> > operations are executed under KVM vCPU locks in the host kernel (EL1)
> > and appear atomic to other vCPU threads. In this context, system
> > register accesses are serialized under KVM vCPU locks, ensuring
> > atomicity with respect to other vCPUs. After SMCCC filtering was
> > introduced, PSCI CPU_ON and CPU_OFF calls can now exit to userspace
> > (QEMU). During the handling of a PSCI CPU_ON call in userspace,
> > cpu_reset() is invoked, which reads ICC_CTLR_EL1 through KVM device
> > attribute IOCTLs. To avoid transient inconsistency and -EBUSY errors,
> > QEMU is forced to pause all vCPUs before issuing these IOCTLs.
>
> I'm going to repeat in public what I already said in private.
>
> Why does QEMU need to know this? I don't see how this is related to PSCI,
> and outside of save/restore, there is no reason why QEMU should poke at
> this. If QEMU needs fixing, please fix QEMU.
Sure, and I did not disagree with it earlier, but because I was not fully
sure, I refrained from replying prematurely here.
>
> Honestly, I don't see why the kernel should even care about this, and I have
> no intention of adopting anything of the sort for something that has all the
> hallmarks of a userspace bug.
I understand your point. So the probable solutions for the problem mentioned
in the patch could be:
1. Remove the KVM device access to the ICC_CTLR_EL1 system register during
CPU reset and only sync with KVM during migration, at source and destination.
2. If 1 is not acceptable, then cache the value in userspace.
3. This KVM shadow register change.
IIUC, you've hinted at option 1 as the solution. We've discussed option 2 as
well and, as I understand it, you don't have many reservations about it? And
the last option, 3, is of course totally rejected.
Hope I got it right?
Many thanks!
Best regards
Salil.
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.