[PATCH] KVM: arm64: Don't split hugepages outside of MMU write lock

Oliver Upton oupton at google.com
Fri Apr 1 07:05:31 PDT 2022


Hi Reiji,

Thanks for the review!

On Thu, Mar 31, 2022 at 11:07:23PM -0700, Reiji Watanabe wrote:
> > @@ -1267,10 +1268,24 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >          */
> >         if (fault_status == FSC_PERM && vma_pagesize == fault_granule) {
> >                 ret = kvm_pgtable_stage2_relax_perms(pgt, fault_ipa, prot);
> 
> When use_read_lock is set to true, it appears the above condition will
> become always true (since fault_granule is PAGE_SIZE and force_pte
> is true in this case).  So, I don't think the following "else" changes
> really make any difference.  (Or am I overlooking something?)
> Looking at the code, I doubt that even the original (before the regression)
> code detects the case that is described in the comment below in the
> first place.

Yes, you are right.

I liked the explicit check against !use_read_lock (even if it is dead)
to make it abundantly clear what is guarded by the write lock.
Personally, I'm not a big fan of the conditional locking because it is
hard to tell exactly what is protected by the read or write lock.

Maybe instead a WARN() could suffice. That way we bark at anyone who
changes MMU locking again and breaks it. I found the bug with the splat
above, but warning about replacing an already present PTE is rather far
removed from the smoking gun in this situation.

Outside of the regression, I believe there are some subtle races where
stage-2 is modified before taking the MMU lock. I'm going to think
further about the implications of these, but perhaps we should pass
through a page level argument to kvm_pagetable_stage2_relax_perms() and
only do the operation if the paging structures match what is reported in
the FSC. If the IPA is not in fact mapped at the specified page level,
retry the faulting instruction.

I'll get a patch up that addresses your comment and crams a WARN() in to
assert the write lock is held.

--
Thanks,
Oliver



More information about the linux-arm-kernel mailing list