[PATCH v2 00/19] arm64: Enable LPA2 support for 4k and 16k pages

Ryan Roberts ryan.roberts at arm.com
Thu Dec 1 08:00:13 PST 2022


On 01/12/2022 13:43, Ard Biesheuvel wrote:
> [...]
>>>>>>
>>>>>> 2) 4 FAILING TESTS: Host kernel gets stuck initializing KVM
>>>>>>
>>>>>>   During kernel boot, last console log is "kvm [1]: vgic interrupt IRQ9". All
>>>>>>   failing tests are configured for protected KVM, and are built with LPA2
>>>>>>   support, running on non-LPA2 HW.
>>>>>>
>>>>>
>>>>> I will try to reproduce this locally.
>>
>> It turns out the same issue is hit when running your patches without mine on
>> top. The root cause in both cases is an assumption that kvm_get_parange() makes
>> that ID_AA64MMFR0_EL1_PARANGE_MAX will always be 48 for 4KB and 16KB PAGE_SIZE.
>> That is no longer true with your patches. It causes the VTCR_EL2 to be
>> programmed incorrectly for the host stage2, then on return to the host, bang.
>>
> 
> Right. I made an attempt at replacing 'u32 level' with 's32 level'
> throughout that code, along with some related changes, but I didn't
> spot this issue.
> 
>> This demonstrates that kernel stage1 support for LPA2 depends on kvm support for
>> LPA2, since for protected kvm, the host stage2 needs to be able to id map the
>> full physical range that the host kernel sees prior to deprivilege. So I don't
>> think it's fixable in your series. I have a fix in my series for this.
>>
> 
> The reference to ID_AA64MMFR0_EL1_PARANGE_MAX should be fixable in
> isolation, no? Even if it results in a KVM that cannot use 52-bit PAs
> while the host can.

Well that doesn't really sound like a good solution to me - the VMM can't
control which physical memory it is using so it would be the luck of the draw as
to whether kvm gets passed memory above 48 PA bits. So you would probably have
to explicitly prevent kvm from initializing if you want to keep your kernel LPA2
patches decoupled from mine.
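
To make the PARange point above concrete, here is a minimal sketch of how a
clamp against a build-time maximum decides the PA size that ends up being used.
It is not the kernel's kvm_get_parange(); the sketch_* names and the
max_supported parameter are hypothetical. Only the PARange encodings are taken
from the architecture.

```c
#include <stdint.h>

/* Architectural ID_AA64MMFR0_EL1.PARange encodings (field bits [3:0]). */
enum {
	PARANGE_32 = 0,
	PARANGE_36 = 1,
	PARANGE_40 = 2,
	PARANGE_42 = 3,
	PARANGE_44 = 4,
	PARANGE_48 = 5,
	PARANGE_52 = 6,
};

/*
 * Hypothetical helper: extract the PARange field and clamp it to the
 * largest encoding the caller claims to support.
 */
static unsigned int sketch_get_parange(uint64_t mmfr0, unsigned int max_supported)
{
	unsigned int parange = (unsigned int)(mmfr0 & 0xf);

	return parange > max_supported ? max_supported : parange;
}

/* Hypothetical helper: convert an encoding to PA bits, 0 if reserved. */
static unsigned int sketch_parange_to_bits(unsigned int parange)
{
	static const unsigned int bits[] = { 32, 36, 40, 42, 44, 48, 52 };

	return parange < sizeof(bits) / sizeof(bits[0]) ? bits[parange] : 0;
}
```

With a cap of PARANGE_48 a CPU reporting 52 bits is limited to 48, while a cap
of PARANGE_52 lets the full value through, so code that silently assumes the
cap is 48 bits will disagree with code that honours the real cap.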

> 
>> I also found another dependency problem (hit by some of the tests that were
>> previously failing at the first issue) where kvm uses its page table library to
>> walk a user space page table created by the kernel (see
>> get_user_mapping_size()). In the case where the kernel creates an LPA2 page
>> table, kvm can't walk it without my patches. I've also added a fix for this to
>> my series.
>>
> 
> OK

Again, I think the only way to work around this without kvm understanding LPA2
is to disable KVM when the kernel is using LPA2.

> 
>>
>>>>>
>>>>>>
>>>>>> 3) 42 FAILING TESTS: Guest kernel never outputs anything to console
>>>>>>
>>>>>>   Host kernel boots fine, and we attempt to launch a guest kernel using kvmtool.
>>>>>>   There is no error reported, but the guest never outputs anything. Haven't
>>>>>>   worked out which config options are common to all failures yet.
>>>>>>
>>>>>
>>>>> This goes a bit beyond what I am currently set up for in terms of
>>>>> testing, but I'm happy to help narrow this down.
>>
>> I don't have a root cause for this yet. I'll try to take a look this afternoon.
>> Will keep you posted.
>>

OK, found it. All these failures occur when loading the guest in 'high' memory. My
test environment allocates a VM with 256M at 2048T (so all memory is above 48 bits
of IPA) and puts the 64KB/52VA/52PA kernel, dtb and initrd there. This works
fine without your changes. I guess as part of your change you have broken the 64KB
kernel's ability to create an ID map above 48 bits (which conforms to your
clarification of the boot protocol). I guess this was intentional? It makes it
really hard for me to test >48 bit IA at stage2...
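
For reference, the arithmetic behind "256M at 2048T is above 48 bits" can be
checked with a small sketch. sketch_addr_bits() is a hypothetical helper, not
something from the kernel or kvmtool.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of address bits needed to represent addr,
 * i.e. the position of the highest set bit plus one (0 for addr == 0).
 */
static unsigned int sketch_addr_bits(uint64_t addr)
{
	unsigned int bits = 0;

	while (addr) {
		bits++;
		addr >>= 1;
	}
	return bits;
}
```

2048T is 2048 << 40, which is 1 << 51, so both the base and the last byte of a
256M region placed there need 52 address bits.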

> 
>>
>> Once I have all the tests passing, I'll post my series, then hopefully we can
>> move it all forwards as one?
>>
> 
> That would be great, yes, although my work depends on a sizable rework
> of the early boot code that has seen very little review as of yet.
> 
> So for the time being, let's keep aligned but let's not put any eggs
> in each other's baskets :-)
>
>> As part of my debugging, I've got a patch to sort out the tlbi code to support
>> LPA2 properly - I think I raised that comment on one of the patches. Are you
>> happy for me to post as part of my series?
>>
> 
> I wasn't sure where to look tbh. The generic 5-level paging stuff
> seems to work fine - is this specific to KVM?

There are a couple of issues that I spotted. First, for the range-based tlbi
instructions, when LPA2 is in use (TCR_EL1.DS=1), BaseADDR must be 64KB aligned
(when LPA2 is disabled it only needs to be page aligned). So some forward
alignment is required, using the non-range tlbi, in __flush_tlb_range(). I
think this would manifest as invalidating the wrong entries if left as-is once
your patches are applied.
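
The forward alignment can be sketched as follows. This only shows the
arithmetic and is not how __flush_tlb_range() is actually structured;
sketch_pages_before_64k_aligned() is a hypothetical name, and the sketch
assumes start is already page aligned.

```c
#include <stdint.h>

#define SKETCH_SZ_64K	0x10000ULL

/*
 * Hypothetical helper: number of pages that would have to be invalidated
 * one at a time (non-range tlbi) before start reaches the next 64KB
 * boundary. A real caller would also cap this at the number of pages
 * left in the range being flushed.
 */
static uint64_t sketch_pages_before_64k_aligned(uint64_t start,
						unsigned int page_shift)
{
	uint64_t aligned = (start + SKETCH_SZ_64K - 1) & ~(SKETCH_SZ_64K - 1);

	return (aligned - start) >> page_shift;
}
```

For example, with 4KB pages a range starting at 0x11000 needs 15 single-page
invalidations before a range-based tlbi can start at 0x20000, whereas with
16KB pages a range starting at 0x1c000 needs just one.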

The second problem is that __tlbi_level() uses level 0 as the "don't use level
hint" sentinel. I think this will work just fine (if slightly suboptimally) if
left as is, since the higher layers never pass anything outside the range
[0, 3] at the moment, but if that changed and -1 were passed then it would cause
a bug.
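
A minimal sketch of why -1 would be a problem, assuming the hint is formed by
keeping only the low two bits of the level (an assumption made for
illustration, not a copy of __tlbi_level()). sketch_ttl_hint() is a
hypothetical name; the granule argument stands for the 2-bit translation
granule value placed above the level in the hint.

```c
#include <stdint.h>

/*
 * Hypothetical model of a level-hint encoder in which level 0 doubles as
 * the "no hint" sentinel. Returns a 4-bit value: granule in bits [3:2],
 * level in bits [1:0], or 0 for "no hint".
 */
static uint64_t sketch_ttl_hint(int level, unsigned int granule)
{
	if (level == 0)
		return 0;	/* sentinel: emit no level hint */

	/* Only two bits of level are kept, so -1 aliases to level 3. */
	return ((uint64_t)granule << 2) | ((unsigned int)level & 3);
}
```

Under that assumption, a caller passing -1 would get the same hint as a
caller passing 3, so the invalidation would be hinted at the wrong level
rather than being issued without a hint.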
