[PATCH v10 00/59] KVM: arm64: ARMv8.3/8.4 Nested Virtualization support

Miguel Luis miguel.luis at oracle.com
Tue Jul 18 03:29:23 PDT 2023


Hi Marc,

> On 10 Jul 2023, at 12:56, Miguel Luis <miguel.luis at oracle.com> wrote:
> 
> Hi Marc,
> 
>> On 29 Jun 2023, at 07:03, Marc Zyngier <maz at kernel.org> wrote:
>> 
>> Hi Ganapatrao,
>> 
>> On Wed, 28 Jun 2023 07:45:55 +0100,
>> Ganapatrao Kulkarni <gankulkarni at os.amperecomputing.com> wrote:
>>> 
>>> 
>>> Hi Marc,
>>> 
>>> 
>>> On 15-05-2023 11:00 pm, Marc Zyngier wrote:
>>>> This is the 4th drop of NV support on arm64 for this year.
>>>> 
>>>> For the previous episodes, see [1].
>>>> 
>>>> What's changed:
>>>> 
>>>> - New framework to track system register traps that are reinjected in
>>>>  guest EL2. It is expected to replace the discrete handling we have
>>>>  enjoyed so far, which didn't scale at all. This has already fixed a
>>>>  number of bugs that were hidden (a bunch of traps were never
>>>>  forwarded...). Still a work in progress, but this is going in the
>>>>  right direction.
>>>> 
>>>> - Allow the L1 hypervisor to have a S2 that has an input larger than
>>>>  the L0 IPA space. This fixes a number of subtle issues, depending on
>>>>  how the initial guest was created.
>>>> 
>>>> - Consequently, the patch series has gone longer again. Boo. But
>>>>  hopefully some of it is easier to review...
>>>> 
>>> 
>>> I am facing an issue booting the NestedVM with both the v9 and v10 patchsets.
>>> 
>>> I have tried v9/v10 on an Ampere platform using kvmtool, and I could boot
>>> the Guest-Hypervisor and then the NestedVM without any issue.
>>> However, when I try to boot using QEMU (not using EDK2/EFI), the
>>> Guest-Hypervisor is booted with Fedora 37 using a virtio disk. From the
>>> Guest-Hypervisor console (or an ssh shell), if I try to boot the NestedVM,
>>> the boot hangs at a very early stage.
>>> 
>>> I did some debugging using ftrace, and it seems the Guest-Hypervisor is
>>> getting a very high rate of arch-timer interrupts; because of that, all
>>> CPU time goes to servicing the Guest-Hypervisor and execution never
>>> returns to the NestedVM.
>>> 
>>> I am using QEMU vanilla version v7.2.0 with top-up patches for NV [1]
>> 
>> So I went ahead and gave QEMU a go. On my systems, *nothing* works: I
>> cannot even boot an L1 with 'virtualization=on' (the guest is stuck at
>> the point where virtio gets probed and waits for its first interrupt).

In order to use the previous patches you need to update QEMU's copies of the
Linux headers to match the target kernel you're testing. So you would want to
run ./scripts/update-linux-headers.sh <kernel src dir> <qemu src dir> in place
of patch 1; then you should be able to boot the L1 guest with
virtualization=on.
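A minimal sketch of that header update step (the paths below are placeholders
for your own checkouts, and the command is printed rather than executed):

```shell
# Placeholder paths: substitute your own trees.
KERNEL_SRC=${KERNEL_SRC:-/path/to/linux}   # kernel tree with the NV series applied
QEMU_SRC=${QEMU_SRC:-/path/to/qemu}        # QEMU tree carrying the NV patches

# QEMU's import script refreshes its copies of the kernel UAPI headers
# (KVM ioctls, capability numbers, ...) from the target kernel tree.
# Printed here instead of run, since the paths above are placeholders:
CMD="cd $QEMU_SRC && ./scripts/update-linux-headers.sh $KERNEL_SRC $QEMU_SRC"
echo "$CMD"
```

After that, rebuild QEMU so it picks up the refreshed headers.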

Regarding the L2 guest, it does not boot and I'm in the process of
understanding why. The previous patches had some room for improvement, but I
couldn't yet relate any of that to the L2 guest failing to boot.

Eric reported an issue with NV and SVE enablement which I'm still looking at
(it is similar to commit 5b578f088ada3c4319f7274c0221b5d92143fe6a in your
kvmtool branch arm64/nv-5.16).

As a test I've disabled SVE on the kernel side with 'arm64.nosve', and the
output matches the one from your kvmtool:

[    0.000000] CPU features: SYS_ID_AA64PFR0_EL1[35:32]: already set to 0
[    0.000000] CPU features: SYS_ID_AA64ZFR0_EL1[59:56]: already set to 0
[    0.000000] CPU features: SYS_ID_AA64ZFR0_EL1[55:52]: already set to 0
[    0.000000] CPU features: SYS_ID_AA64ZFR0_EL1[47:44]: already set to 0
[    0.000000] CPU features: SYS_ID_AA64ZFR0_EL1[43:40]: already set to 0
[    0.000000] CPU features: SYS_ID_AA64ZFR0_EL1[35:32]: already set to 0
[    0.000000] CPU features: SYS_ID_AA64ZFR0_EL1[23:20]: already set to 0
[    0.000000] CPU features: SYS_ID_AA64ZFR0_EL1[19:16]: already set to 0
[    0.000000] CPU features: SYS_ID_AA64ZFR0_EL1[7:4]: already set to 0
[    0.000000] CPU features: SYS_ID_AA64ZFR0_EL1[3:0]: already set to 0

That didn't enable the L2 guest to boot on QEMU either, so the issue still
feels like it's in a grey area.
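(For reference, 'arm64.nosve' is an L1 kernel command-line switch; below is an
illustrative sketch of how it might be passed when direct-booting the L1
kernel under QEMU. Every path and option is a placeholder rather than the
exact command used in this thread, and the invocation is only printed, not
run.)

```shell
# Hypothetical invocation sketch: Image/rootfs paths and machine options
# are placeholders. The point is that arm64.nosve rides on -append, i.e.
# the L1 guest kernel's command line.
QEMU_CMD='qemu-system-aarch64 \
  -machine virt,accel=kvm,gic-version=3,virtualization=on \
  -cpu host -m 4096 -nographic \
  -kernel Image -initrd rootfs.cpio \
  -append "console=ttyAMA0 arm64.nosve"'
echo "$QEMU_CMD"
```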

As a baseline I tested your kvmtool branch for 5.16 after updating the
includes, and, as expected, both the L1 and L2 guests boot.

Miguel

>> 
>> Worse, booting a hVHE guest results in QEMU generating an assert as it
>> tries to inject an interrupt using the QEMU GICv3 model, something
>> that should *NEVER* be in use with KVM.
>> 
>> With help from Eric, I got to a point where the hVHE guest could boot
>> as long as I kept injecting console interrupts, which is again a
>> symptom of the vGIC not being used.
>> 
>> So something is *majorly* wrong with the QEMU patches. I don't know
>> what makes it possible for you to even boot the L1 - if the GIC is
>> external, injecting an interrupt in the L2 is simply impossible.
>> 
>> Miguel, can you please investigate this?
> 
> Yes, I will investigate it. Sorry for the delay in replying: I took a break
> shortly after KVM Forum and I've just started to sync up.
> 
> Thanks,
> Miguel
> 
>> 
>> In the meantime, I'll add some code to the kernel side to refuse the
>> external interrupt controller configuration with NV. Hopefully that
>> will lead to some clues about what is going on.
>> 
>> Thanks,
>> 
>> M.
>> 
>> -- 
>> Without deviation from the norm, progress is not possible.



