[PATCH v2 01/17] KVM: arm64: Factor out stage 2 page table data from struct kvm

Alexandru Elisei alexandru.elisei at arm.com
Mon Jul 6 11:49:41 EDT 2020


Hi Marc,

On 7/6/20 1:17 PM, Marc Zyngier wrote:
> On 2020-06-25 13:19, Alexandru Elisei wrote:
>> Hi Marc,
>>
>> On 6/16/20 5:18 PM, Marc Zyngier wrote:
>>> Hi Alexandru,
>>> [..]
>>>>> [..]
>>>>>
>>>>>  /**
>>>>> - * kvm_alloc_stage2_pgd - allocate level-1 table for stage-2 translation.
>>>>> - * @kvm:    The KVM struct pointer for the VM.
>>>>> + * kvm_init_stage2_mmu - Initialise a S2 MMU strucrure
>>>>> + * @kvm:    The pointer to the KVM structure
>>>>> + * @mmu:    The pointer to the s2 MMU structure
>>>>>   *
>>>>>   * Allocates only the stage-2 HW PGD level table(s) of size defined by
>>>>> - * stage2_pgd_size(kvm).
>>>>> + * stage2_pgd_size(mmu->kvm).
>>>>>   *
>>>>>   * Note we don't need locking here as this is only called when the VM is
>>>>>   * created, which can only be done once.
>>>>>   */
>>>>> -int kvm_alloc_stage2_pgd(struct kvm *kvm)
>>>>> +int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu)
>>>>>  {
>>>>>      phys_addr_t pgd_phys;
>>>>>      pgd_t *pgd;
>>>>> +    int cpu;
>>>>>
>>>>> -    if (kvm->arch.pgd != NULL) {
>>>>> +    if (mmu->pgd != NULL) {
>>>>>          kvm_err("kvm_arch already initialized?\n");
>>>>>          return -EINVAL;
>>>>>      }
>>>>> @@ -1024,8 +1040,20 @@ int kvm_alloc_stage2_pgd(struct kvm *kvm)
>>>>>      if (WARN_ON(pgd_phys & ~kvm_vttbr_baddr_mask(kvm)))
>>>>>          return -EINVAL;
>>>>
>>>> We don't free the pgd if we get the error above, but we do free it below, if
>>>> allocating last_vcpu_ran fails. Shouldn't we free it in both cases?
>>>
>>> Worth investigating. This code gets majorly revamped in the NV series, so it is
>>> likely that I missed something in the middle.
>>
>> You didn't miss anything, I checked and it's the same in the upstream
>> version of KVM.
>>
>> kvm_arch_init_vm() returns with an error if this functions fails, so it's up to
>> the function to do the clean up. kvm_alloc_pages_exact() returns NULL
>> on error, so
>> at this point we have a valid allocation of physical contiguous pages.
>> Failing to
>> create a VM is not a fatal error for the system, so I'm thinking that maybe we
>> should free those pages for the rest of the system to use. However, this is a
>> minor issue, and the patch isn't supposed to make any functional changes, so it
>> can be probably be left for another patch and not add more to an
>> already big series.
>
> Cool. Will you be posting such patch?

I was considering one, but then I realized there's something that I still don't
understand.

alloc_pages_exact() allocates 2^order pages (where 2^order pages >=
stage2_pgd_size()) via __get_free_pages -> alloc_pages(), then it frees the
unneeded pages at the *end* of the allocation in make_alloc_exact(). So the
beginning of the allocated physical area remains aligned to 2^order pages, and
implicitly to stage2_pgd_size().

But now I can't figure out why kvm_vttbr_baddr_mask() isn't simply defined as
~stage2_pgd_size(). Is it possible for kvm_vttbr_baddr_mask() to return an
alignment larger than the size of the table? I can't seem to find anything to that
effect in ARM arm (but that doesn't mean that I didn't miss anything).

Thanks,
Alex



More information about the linux-arm-kernel mailing list