[PATCH v12 04/16] arm64: kvm: allows kvm cpu hotplug

Fri Dec 11 10:00:25 PST 2015

On 12/11/2015 10:28 AM, Marc Zyngier wrote:
> On 11/12/15 08:06, AKASHI Takahiro wrote:
>> Ashwin, Marc,
>>
>> On 12/03/2015 10:58 PM, Marc Zyngier wrote:
>>> On 02/12/15 22:40, Ashwin Chaugule wrote:
>>>> Hello,
>>>>
>>>> On 24 November 2015 at 17:25, Geoff Levand <geoff at infradead.org> wrote:
>>>>> From: AKASHI Takahiro <takahiro.akashi at linaro.org>
>>>>>
>>>>> The current kvm implementation on arm64 does cpu-specific initialization
>>>>> at system boot, and has no way to gracefully shutdown a core in terms of
>>>>> kvm. This prevents, especially, kexec from rebooting the system on a boot
>>>>> core in EL2.
>>>>>
>>>>> This patch adds a cpu tear-down function and also puts an existing cpu-init
>>>>> code into a separate function, kvm_arch_hardware_disable() and
>>>>> kvm_arch_hardware_enable() respectively.
>>>>> We don't need arm64-specific cpu hotplug hook any more.
>>>>>
>>>>> Since this patch modifies common part of code between arm and arm64, one
>>>>> stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
>>>>> compiling errors.
>>>>>
>>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi at linaro.org>
>>>>> ---
>>>>>    arch/arm/include/asm/kvm_host.h   | 10 ++++-
>>>>>    arch/arm/include/asm/kvm_mmu.h    |  1 +
>>>>>    arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
>>>>>    arch/arm/kvm/mmu.c                |  5 +++
>>>>>    arch/arm64/include/asm/kvm_host.h | 16 +++++++-
>>>>>    arch/arm64/include/asm/kvm_mmu.h  |  1 +
>>>>>    arch/arm64/include/asm/virt.h     |  9 +++++
>>>>>    arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
>>>>>    arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
>>>>>    9 files changed, 138 insertions(+), 48 deletions(-)
>>>> [..]
>>>>
>>>>>
>>>>>    static struct notifier_block hyp_init_cpu_pm_nb = {
>>>>> @@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
>>>>>           }
>>>>>
>>>>>           /*
>>>>> -        * Execute the init code on each CPU.
>>>>> -        */
>>>>> -       on_each_cpu(cpu_init_hyp_mode, NULL, 1);
>>>>> -
>>>>> -       /*
>>>>>            * Init HYP view of VGIC
>>>>>            */
>>>>>           err = kvm_vgic_hyp_init();
>>>> With this flow, the cpu_init_hyp_mode() is called only at VM guest
>>>> creation, but vgic_hyp_init() is called at bootup. On a system with
>>>> GICv3, it looks like we end up with bogus values from the ICH_VTR_EL2
>>>> (to get the number of LRs), because we're not reading it from EL2
>>>> anymore.
>> Thank you for pointing this out.
>> Recently I tested my kdump code on hikey, and as hikey(hi6220) has gic-400,
>> I didn't notice this problem.
> Because GIC-400 is a GICv2 implementation, which is entirely MMIO based.
> GICv3 uses some system registers that are only available at EL2, and KVM
> needs some information contained in these registers before being able to
> get initialized.
>
>>> Indeed, this is completely broken (I just reproduced the issue on a
>>> model). I wish this kind of details had been checked earlier, but thanks
>>> for pointing it out.
>>>
>>>> What's the best way to fix this?
>>>> - Call kvm_arch_hardware_enable() before vgic_hyp_init() and disable later?
>>>> - Fold the VGIC init stuff back into hardware_enable()?
>>> None of that works - kvm_arch_hardware_enable() is called once per CPU,
>>> while vgic_hyp_init() can only be called once. Also,
>>> kvm_arch_hardware_enable() is called from interrupt context, and I
>>> wouldn't feel comfortable starting probing DT and allocating stuff from
>>> there.
>> Do you think so?
>> How about the fixup! patch attached below?
>> The point is that, like Ashwin's first idea, we initialize cpus temporarily
>> before kvm_vgic_hyp_init() and then soon reset cpus again. Thus,
>> kvm cpu hotplug will still continue to work as before.
>> Now that cpu_init_hyp_mode() is revived as exactly the same as Marc's
>> original code, the change will not be a big jump.
> This seems quite complicated:
> - init EL2 on all CPUs
> - do some initialization
> - tear all CPUs EL2 down
> - let KVM drive the vectors being set or not
>
> My questions are: why do we need to do this on *all* cpus? Can't that
> work on a single one?
>

Single-CPU EL2 initialization should be fine as long as no kernel
preemption happens between the EL2 init and the execution of
kvm_vgic_hyp_init(). However, init_hyp_mode() is called by
do_basic_setup() with preemption enabled.

I don't have deep knowledge of how the scheduler behaves during kernel
boot, but initializing all CPUs definitely helps if preemption happens
after the EL2 init and before kvm_vgic_hyp_init() reads the ICH_VTR_EL2
register.
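
To make the concern concrete, here is a toy user-space model of the
race. It is only a sketch, not kernel code: every name in it
(model_init_el2_here(), probe_single_cpu(), REAL_VTR, JUNK_VTR, ...) is
made up for illustration, and "migration" is simulated by changing a
variable rather than by the real scheduler.

```c
#include <stdbool.h>

#define NR_CPUS  4
#define REAL_VTR 0x3u         /* hypothetical placeholder for a valid read */
#define JUNK_VTR 0xdeadbeefu  /* stands in for the bogus value */

static bool el2_ready[NR_CPUS]; /* per-CPU: has EL2 been initialized? */
static int current_cpu;         /* CPU the init code is running on */

/* Start every probe from the same state: CPU 0, no CPU initialized. */
static void model_reset(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		el2_ready[cpu] = false;
	current_cpu = 0;
}

/* Models EL2 init on the CPU we are currently running on. */
static void model_init_el2_here(void)
{
	el2_ready[current_cpu] = true;
}

/* Models the register read: only valid where EL2 was initialized. */
static unsigned int model_read_vtr(void)
{
	return el2_ready[current_cpu] ? REAL_VTR : JUNK_VTR;
}

/*
 * Init EL2 on one CPU, then read. migrate_to >= 0 simulates being
 * preempted and resumed on another CPU between the two steps;
 * migrate_to < 0 simulates running with preemption disabled.
 */
unsigned int probe_single_cpu(int migrate_to)
{
	model_reset();
	model_init_el2_here();
	if (migrate_to >= 0)
		current_cpu = migrate_to;
	return model_read_vtr();
}

/* Init EL2 on all CPUs first, so a migration is harmless. */
unsigned int probe_all_cpus(int migrate_to)
{
	model_reset();
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		el2_ready[cpu] = true;
	if (migrate_to >= 0)
		current_cpu = migrate_to;
	return model_read_vtr();
}
```

In this model, probe_single_cpu(-1) returns REAL_VTR and
probe_single_cpu(2) returns JUNK_VTR, while probe_all_cpus(2) still
returns REAL_VTR. So the single-CPU approach would need the init and the
read to happen without a migration in between, whereas the all-CPUs
approach does not depend on that.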

> Also, the simple fact that we were able to get some junk value is a sign
> that something is amiss. I'd expect a splat of some sort, because we now
> have a possibility of doing things in the wrong context.
>
>> If kvm_hyp_call() in vgic_v3_probe()/kvm_vgic_hyp_init() is a *problem*,
>> I hope this should work. Actually I confirmed that, with this fixup! patch,
>> we could run a kvm guest and also successfully executed kexec on model w/gic-v3.
>>
>> My only concern is the following kernel message I saw when kexec shut down
>> the kernel:
>> (Please note that I was running one kvm guest (pid=961) here.)
>>
>> ===
>> sh-4.3# ./kexec -d -e
>> kexec version: 15.11.16.11.06-g41e52e2
>> arch_process_options:112: command_line: (null)
>> arch_process_options:114: initrd: (null)
>> arch_process_options:115: dtb: (null)
>> arch_process_options:117: port: 0x0
>> kvm: exiting hardware virtualization
>> kvm [961]: Unsupported exception type: 6248304    <== this message
> That makes me feel very uncomfortable. It looks like we've exited a
> guest with some horrible value in X0. How is that even possible?
>
> This deserves to be investigated.
>
> Thanks,
>
> 	M.