[RFC PATCH 0/5] KVM: arm64: Add pvtime LPT support

Steven Price steven.price at arm.com
Wed Sep 2 06:09:04 EDT 2020


Hi Marc,

Sorry for the slow response, I've been on holiday.

On 22/08/2020 11:31, Marc Zyngier wrote:
> Hi Steven,
> 
> On Wed, 19 Aug 2020 09:54:40 +0100,
> Steven Price <steven.price at arm.com> wrote:
>>
>> On 18/08/2020 15:41, Marc Zyngier wrote:
>>> On 2020-08-17 09:41, Keqian Zhu wrote:
[...]
>>>>
>>>> Things need concern:
>>>> 1. https://developer.arm.com/docs/den0057/a needs update.
>>>
>>> LPT was explicitly removed from the spec because it doesn't really
>>> solve the problem, specially for the firmware: EFI knows
>>> nothing about this, for example. How is it going to work?
>>> Also, nobody was ever able to explain how this would work for
>>> nested virt.
>>>
>>> ARMv8.4 and ARMv8.6 have the feature set that is required to solve
>>> this problem without adding more PV to the kernel.
>>
>> Hi Marc,
>>
>> These are good points, however we do still have the situation that
>> CPUs that don't have ARMv8.4/8.6 clearly cannot implement this. I
>> presume the use-case Keqian is looking at predates the necessary
>> support in the CPU - Keqian if you can provide more details on the
>> architecture(s) involved that would be helpful.
> 
> My take on this is that it is a fictional use case. In my experience,
> migration happens across *identical* systems, and *any* difference
> visible to guests will cause things to go wrong. Errata management
> gets in the way, as usual (name *one* integration that isn't broken
> one way or another!).

Keqian appears to have a use case - but obviously I don't know the 
details. I guess Keqian needs to convince you of that.

> Allowing migration across heterogeneous hosts requires a solution to
> the errata management problem, which everyone (including me) has
> decided to ignore so far (and I claim that not having a constant timer
> frequency exposed to guests is an architecture bug).

I agree - errata management needs to be solved before LPT. Between 
restricted subsets of hosts this doesn't seem impossible, but I guess we 
should stall LPT until a credible solution is proposed. I'm certainly 
not proposing one at the moment.

>> Nested virt is indeed more of an issue - we did have some ideas around
>> using SDEI that never made it to the spec.
> 
> SDEI? Sigh... Why would SDEI be useful for NV and not for !NV?

SDEI provides a way of injecting a synchronous exception on migration - 
although that certainly isn't the only possible mechanism. For NV we 
have the problem that a guest-guest may be running at the point of 
migration. However it's not practical for the host hypervisor to provide 
the necessary table directly to the guest-guest which means the 
guest-hypervisor must update the tables before the guest-guest is 
allowed to run on the new host. The only plausible route I could see for 
this is injecting a synchronous exception into the guest (per VCPU) to 
ensure any guest-guests running are exited at migration time.

!NV is easier because we don't have to worry about multiple levels of 
para-virtualisation.

>> However I would argue that the most pragmatic approach would be to
>> not support the combination of nested virt and LPT. Hopefully that
>> can wait until the counter scaling support is available and not
>> require PV.
> 
> And have yet another set of band aids that paper over the fact that we
> can't get a consistent story on virtualization? No, thank you.
> 
> NV is (IMHO) much more important than LPT as it has a chance of
> getting used. LPT is just another tick box, and the fact that ARM is
> ready to ignore sideline a decent portion of the architecture is a
> clear sign that it hasn't been thought out.

Different people have different priorities. NV is definitely important 
for many people. LPT may also be important if you've already got a bunch 
of VMs running on machines and you want to be able to (gradually) 
replace them with newer hosts which happen to have a different clock 
frequency. Those VMs running now clearly aren't using NV.

However, I have to admit it's not me that has the use-case, so I'll 
leave it for others who might actually know the specifics to explain the 
details.

>> We are discussing (re-)releasing the spec with the LPT parts added. If
>> you have fundamental objections then please me know.
> 
> I do, see above. I'm stating that the use case doesn't really exist
> given the state of the available HW and the fragmentation of the
> architecture, and that ignoring the most important innovation in the
> virtualization architecture since ARMv7 is at best short-sighted.
> 
> Time scaling is just an instance of the errata management problem, and
> that is the issue that needs solving. Papering over part of the
> problem is not helping.

I fully agree - errata management is definitely the first step that 
needs solving. This is why I abandoned LPT originally because I don't 
have a generic solution and the testing I did involved really ugly hacks 
just to make the migration possible.

For now I propose we (again) park LPT until some progress has been made 
on errata management.

Thanks,

Steve



More information about the linux-arm-kernel mailing list