[PATCH v2 03/11] KVM: arm64: Make kvm_skip_instr() and co private to HYP
Marc Zyngier
maz at kernel.org
Thu May 6 07:29:37 PDT 2021
On Thu, 06 May 2021 12:43:26 +0100,
Zenghui Yu <yuzenghui at huawei.com> wrote:
>
> On 2021/5/6 14:33, Marc Zyngier wrote:
> > On Wed, 05 May 2021 17:46:51 +0100,
> > Marc Zyngier <maz at kernel.org> wrote:
> >>
> >> Hi Zenghui,
> >>
> >> On Wed, 05 May 2021 15:23:02 +0100,
> >> Zenghui Yu <yuzenghui at huawei.com> wrote:
> >>>
> >>> Hi Marc,
> >>>
> >>> On 2020/11/3 0:40, Marc Zyngier wrote:
> >>>> In an effort to remove the vcpu PC manipulations from EL1 on nVHE
> >>>> systems, move kvm_skip_instr() to be HYP-specific. EL1's intent
> >>>> to increment PC post emulation is now signalled via a flag in the
> >>>> vcpu structure.
> >>>>
> >>>> Signed-off-by: Marc Zyngier <maz at kernel.org>
> >>>
> >>> [...]
> >>>
> >>>> @@ -133,6 +134,8 @@ static int __kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
> >>>>  	__load_guest_stage2(vcpu->arch.hw_mmu);
> >>>>  	__activate_traps(vcpu);
> >>>> +	__adjust_pc(vcpu);
> >>>
> >>> If the INCREMENT_PC flag was set (e.g., for WFx emulation) while we're
> >>> handling a PSCI CPU_ON call targeting this VCPU, the *target_pc* (aka
> >>> entry point address, normally provided by the primary VCPU) will be
> >>> unexpectedly incremented here. That's pretty bad, I think.
> >>
> >> How can you online a CPU using PSCI if that CPU is currently spinning
> >> on a WFI? Or is it that we have transitioned via userspace to perform
> >> the vcpu reset? I can imagine it happening in that case.
>
> I hadn't tried to reset the VCPU from userspace. That would be a much
> easier way to reproduce this problem.

Then I don't understand how you end up there. If the vcpu was in WFI,
it wasn't off and PSCI_CPU_ON doesn't have any effect.

> > Actually, this is far worse than it looks, and this only papers over
> > one particular symptom. We need to resolve all pending PC updates
> > *before* returning to userspace, or things like live migration can
> > observe an inconsistent state.
>
> Ah yeah, agreed.
>
> Apart from the PC manipulation, I noticed that when handling the user
> GET_VCPU_EVENTS request:
>
> | /*
> |  * We never return a pending ext_dabt here because we deliver it to
> |  * the virtual CPU directly when setting the event and it's no longer
> |  * 'pending' at this point.
> |  */
>
> Which isn't true anymore now that we defer the exception injection until
> right before the VCPU entry.

I believe the comment will be valid again once I fix the core issue,
which is that we shouldn't return to userspace with pending PC
adjustments. As long as KVM_GET_VCPU_EVENTS isn't issued on a running
vcpu (which looks pointless to me), this should be just fine.
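
To make the failure mode concrete, here is a minimal userspace sketch in
C. It is not KVM code: toy_vcpu, TOY_INCREMENT_PC and the toy_* helpers
are hypothetical stand-ins for the vcpu structure, the INCREMENT_PC flag
and __adjust_pc(). It assumes that a reset simply overwrites PC and
leaves a pending increment in place, which is the situation described
earlier in the thread, and it contrasts that with resolving the pending
adjustment before control is handed back.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the real KVM definitions. */
#define TOY_INCREMENT_PC (1u << 0)

struct toy_vcpu {
	uint64_t pc;
	unsigned int flags;
};

/* EL1 side: only record the intent to skip the emulated instruction. */
static void toy_skip_instr(struct toy_vcpu *vcpu)
{
	vcpu->flags |= TOY_INCREMENT_PC;
}

/* Resolve a pending increment, if any (AArch64 instructions are 4 bytes). */
static void toy_adjust_pc(struct toy_vcpu *vcpu)
{
	if (vcpu->flags & TOY_INCREMENT_PC) {
		vcpu->pc += 4;
		vcpu->flags &= ~TOY_INCREMENT_PC;
	}
}

/* Reset (e.g. following a CPU_ON request): install the new entry point. */
static void toy_reset(struct toy_vcpu *vcpu, uint64_t target_pc)
{
	vcpu->pc = target_pc;
}

/* Guest entry: any pending adjustment is applied here, then PC is used. */
static uint64_t toy_enter_guest(struct toy_vcpu *vcpu)
{
	toy_adjust_pc(vcpu);
	return vcpu->pc;
}

/*
 * Run "skip, reset to 0x8000, enter" and return the PC the guest starts at.
 * If flush_before_exit is non-zero, the pending increment is resolved
 * before the reset can happen (the assumed fix).
 */
static uint64_t toy_scenario(int flush_before_exit)
{
	struct toy_vcpu vcpu = { .pc = 0x1000, .flags = 0 };
	uint64_t entry_pc;

	toy_skip_instr(&vcpu);
	if (flush_before_exit)
		toy_adjust_pc(&vcpu);
	toy_reset(&vcpu, 0x8000);
	entry_pc = toy_enter_guest(&vcpu);

	/* Whichever path was taken, nothing is left pending after entry. */
	assert(!(vcpu.flags & TOY_INCREMENT_PC));
	return entry_pc;
}
```

In this sketch, toy_scenario(0) starts the guest at 0x8004 (the stale
increment lands on the new entry point), while toy_scenario(1) starts it
at 0x8000 as intended.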

Thanks,

	M.

--
Without deviation from the norm, progress is not possible.