[PATCH 2/2] KVM: arm64: Try PMD block mappings if PUD mappings are not supported

Marc Zyngier maz at kernel.org
Tue Sep 8 08:41:27 EDT 2020


On 2020-09-08 13:23, Alexandru Elisei wrote:
> Hi Marc,
> 
> On 9/4/20 10:58 AM, Marc Zyngier wrote:
>> Hi Alex,
>> 
>> On Tue, 01 Sep 2020 14:33:57 +0100,
>> Alexandru Elisei <alexandru.elisei at arm.com> wrote:
>>> When userspace uses hugetlbfs for the VM memory, user_mem_abort()
>>> tries to use the same block size to map the faulting IPA in stage 2.
>>> If stage 2 cannot use the same size mapping because the block size
>>> doesn't fit in the memslot or the memslot is not properly aligned,
>>> user_mem_abort() will fall back to a page mapping, regardless of the
>>> block size. We can do better for PUD backed hugetlbfs by checking if
>>> a PMD block mapping is possible before deciding to use a page.
>>> 
>>> vma_pagesize is an unsigned long, use 1UL instead of 1ULL when
>>> assigning its value.
>>> 
>>> Signed-off-by: Alexandru Elisei <alexandru.elisei at arm.com>
>>> ---
>>>  arch/arm64/kvm/mmu.c | 19 ++++++++++++++-----
>>>  1 file changed, 14 insertions(+), 5 deletions(-)
>>> 
>>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
>>> index 25e7dc52c086..f590f7355cda 100644
>>> --- a/arch/arm64/kvm/mmu.c
>>> +++ b/arch/arm64/kvm/mmu.c
>>> @@ -1871,15 +1871,24 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>>  	else
>>>  		vma_shift = PAGE_SHIFT;
>>> 
>>> -	vma_pagesize = 1ULL << vma_shift;
>>>  	if (logging_active ||
>>> -	    (vma->vm_flags & VM_PFNMAP) ||
>>> -	    !fault_supports_stage2_huge_mapping(memslot, hva, vma_pagesize)) {
>>> +	    (vma->vm_flags & VM_PFNMAP)) {
>>>  		force_pte = true;
>>> -		vma_pagesize = PAGE_SIZE;
>>>  		vma_shift = PAGE_SHIFT;
>>>  	}
>>> 
>>> +	if (vma_shift == PUD_SHIFT &&
>>> +	    !fault_supports_stage2_huge_mapping(memslot, hva, PUD_SIZE))
>>> +		vma_shift = PMD_SHIFT;
>>> +
>>> +	if (vma_shift == PMD_SHIFT &&
>>> +	    !fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE)) {
>>> +		force_pte = true;
>>> +		vma_shift = PAGE_SHIFT;
>>> +	}
>>> +
>>> +	vma_pagesize = 1UL << vma_shift;
>>> +
>>>  	/*
>>>  	 * The stage2 has a minimum of 2 level table (For arm64 see
>>>  	 * kvm_arm_setup_stage2()). Hence, we are guaranteed that we can
>>> @@ -1889,7 +1898,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>>  	 */
>>>  	if (vma_pagesize == PMD_SIZE ||
>>>  	    (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm)))
>>> -		gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
>>> +		gfn = (fault_ipa & ~(vma_pagesize - 1)) >> PAGE_SHIFT;
>>>  	mmap_read_unlock(current->mm);
>>> 
>>>  	/* We need minimum second+third level pages */
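
For readers following the diff, here is a minimal, self-contained model of
the fallback chain as it reads after this patch. It is a sketch, not the
kernel code: supports_huge() is a hypothetical stand-in for
fault_supports_stage2_huge_mapping(), and the shift constants assume arm64
with 4K pages.

#include <stdbool.h>
#include <stdio.h>

#define PAGE_SHIFT	12	/* arm64, 4K pages */
#define PMD_SHIFT	21	/* 2M block */
#define PUD_SHIFT	30	/* 1G block */

/*
 * Hypothetical stand-in for fault_supports_stage2_huge_mapping():
 * pretend the memslot can take PMD blocks but not PUD blocks.
 */
static bool supports_huge(unsigned long size)
{
	return size <= (1UL << PMD_SHIFT);
}

int main(void)
{
	unsigned long vma_shift = PUD_SHIFT;	/* hugetlbfs gave a 1G page */
	unsigned long vma_pagesize, gfn;
	unsigned long fault_ipa = 0x40300000UL;	/* example faulting IPA */
	bool force_pte = false;

	/* New in this patch: step down from PUD to PMD first... */
	if (vma_shift == PUD_SHIFT && !supports_huge(1UL << PUD_SHIFT))
		vma_shift = PMD_SHIFT;

	/* ...and only fall back to a page if PMD doesn't fit either. */
	if (vma_shift == PMD_SHIFT && !supports_huge(1UL << PMD_SHIFT)) {
		force_pte = true;
		vma_shift = PAGE_SHIFT;
	}

	/* 1UL, not 1ULL: vma_pagesize is an unsigned long. */
	vma_pagesize = 1UL << vma_shift;

	/*
	 * Align the IPA down to the block size, then convert to a frame
	 * number: 0x40300000 & ~(2M - 1) = 0x40200000, so gfn = 0x40200.
	 */
	gfn = (fault_ipa & ~(vma_pagesize - 1)) >> PAGE_SHIFT;

	printf("pagesize=0x%lx force_pte=%d gfn=0x%lx\n",
	       vma_pagesize, force_pte, gfn);
	return 0;
}

With these example values the model prints pagesize=0x200000 force_pte=0
gfn=0x40200, i.e. the fault is served with a 2M block instead of dropping
straight to 4K pages as the old code would.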
>> Although this looks like a sensible change, I'm reluctant to take it
>> at this stage, given that we already have a bunch of patches from Will
>> to change the way we deal with PTs.
>> 
>> Could you look into how this could fit into the new code instead?
> 
> Sure, that sounds very sensible. I'm in the process of reviewing
> Will's series, and after I'm done I'll rebase this on top of his
> patches and send it as v2. Does that sound ok to you? Or do you want
> me to base this patch on one of your branches?

Either way is fine (kvmarm/next has his patches). Just let me know
what this is based on when you post the patches.

         M.
-- 
Jazz is not dead. It just smells funny...


