[PATCH v1] arm64: mm: Permit PTE SW bits to change in live mappings

Peter Xu peterx at redhat.com
Wed Jun 19 12:04:41 PDT 2024


On Wed, Jun 19, 2024 at 04:58:32PM +0100, Ryan Roberts wrote:
> The code in question is:
> 
> 	if (userfaultfd_pte_wp(vma, ptep_get(vmf->pte))) {
> 		if (!userfaultfd_wp_async(vma)) {
> 			pte_unmap_unlock(vmf->pte, vmf->ptl);
> 			return handle_userfault(vmf, VM_UFFD_WP);
> 		}
> 
> 		/*
> 		 * Nothing needed (cache flush, TLB invalidations,
> 		 * etc.) because we're only removing the uffd-wp bit,
> 		 * which is completely invisible to the user.
> 		 */
> 		pte = pte_clear_uffd_wp(ptep_get(vmf->pte));
> 
> 		set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
> 		/*
> 		 * Update this to be prepared for following up CoW
> 		 * handling
> 		 */
> 		vmf->orig_pte = pte;
> 	}
> 
> Perhaps we should consider a change to the following style as a cleanup?
> 
> 	old_pte = ptep_modify_prot_start(vma, addr, ptep);
> 	ptent = pte_clear_uffd_wp(old_pte);
> 	ptep_modify_prot_commit(vma, addr, ptep, old_pte, ptent);

You're probably right that at least the access bit is racy to set here, so
we risk losing it if the update races with the HW walker.  The dirty bit
shouldn't be a concern in this case due to the missing W bit, IIUC.

IMO it's a question of whether we want to make the access bit 100% accurate
when such a race happens, while paying for that with an always-slower
generic path.  It does look cleaner, but maybe it isn't very beneficial in
practice.
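
To spell out the race I have in mind (an illustrative interleaving, not
something I've actually reproduced):

	/*
	 * faulting CPU                          HW page-table walker
	 * ------------------------------------  --------------------
	 * pte = ptep_get(vmf->pte);             // AF clear in our copy
	 *                                        // sets AF in live PTE
	 * pte = pte_clear_uffd_wp(pte);
	 * set_pte_at(mm, addr, vmf->pte, pte);  // stale copy clobbers AF
	 */

IIUC the generic ptep_modify_prot_start() falls back to
ptep_get_and_clear(), which atomically snapshots and invalidates the entry
so the walker can't update it under our feet; that's also where the extra
cost comes from.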

> 
> Regardless, this patch is still a correct and valuable change; the arm64
> arch doesn't care if SW bits are modified in valid mappings, so we
> shouldn't be checking for it.

Agreed.  Let's keep this discussion separate from the original patch, since
that already fixes things on its own.

> 
> > 
> >>
> >>  	/* creating or taking down mappings is always safe */
> >>  	if (!pte_valid(__pte(old)) || !pte_valid(__pte(new)))
> >> --
> >> 2.43.0
> >>
> > 
> > When looking at this function, this also caught my attention:
> > 
> > 	/* live contiguous mappings may not be manipulated at all */
> > 	if ((old | new) & PTE_CONT)
> > 		return false;
> > 
> > I'm now wondering how cont-ptes interact with uffd-wp on arm64, from
> > either the hugetlb or mTHP pov.  This check may be relevant here as a
> > start.
> 
> When transitioning a block of ptes between cont and non-cont, we take the
> block through the invalid state, with TLB invalidation.  See
> contpte_convert().
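
(My rough reading of contpte_convert(), heavily trimmed from
arch/arm64/mm/contpte.c, for anyone following along:)

	/* Break-before-make across the whole contig block. */
	for (i = 0; i < CONT_PTES; i++, ptep++, addr += PAGE_SIZE) {
		pte_t ptent = __ptep_get_and_clear(mm, addr, ptep);

		/* Don't lose HW-set bits while the block is down. */
		if (pte_dirty(ptent))
			pte = pte_mkdirty(pte);
		if (pte_young(ptent))
			pte = pte_mkyoung(pte);
	}

	__flush_tlb_range(&vma, start_addr, addr, PAGE_SIZE, true, 3);

	/* Re-install the whole block with PTE_CONT toggled. */
	__set_ptes(mm, start_addr, start_ptep, pte, CONT_PTES);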
> 
> > 
> > The other thing is that, since x86 doesn't have cont-ptes yet, uffd-wp
> > never considered them, and there may be things overlooked, at least on
> > my side.  E.g., consider wr-protecting a cont-pte huge page on hugetlb:
> > 
> > static inline pte_t huge_pte_mkuffd_wp(pte_t pte)
> > {
> > 	return huge_pte_wrprotect(pte_mkuffd_wp(pte));
> > }
> > 
> > I think it means that, so far, it only touches the 1st of the cont-ptes
> > and not the rest.  Not sure whether it'll work if a write happens on the
> > rest.
> 
> I'm not completely sure I follow your point. I think this should work correctly.
> The arm64 huge_pte code knows what size (and level) the huge pte is and spreads
> the passed in pte across all the HW ptes.
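
(I take it that's the fan-out loop in arm64's set_huge_pte_at(), roughly
the below, heavily trimmed:)

	ncontig = num_contig_ptes(sz, &pgsize);
	dpfn = pgsize >> PAGE_SHIFT;
	hugeprot = pte_pgprot(pte);	/* SW bits, incl. uffd-wp, ride along */
	pfn = pte_pfn(pte);

	for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
		__set_ptes(mm, addr, ptep, pfn_pte(pfn, hugeprot), 1);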

What I was considering is wr-protecting a 64K cont-pte entry on arm64:

  UFFDIO_WRITEPROTECT -> hugetlb_change_protection() -> huge_pte_mkuffd_wp()

What I'd expect is that huge_pte_mkuffd_wp() wr-protects all of the ptes,
but that doesn't look to be the case right now.  I'm not sure the HW can
tell that "the whole 64K is wr-protected" here, rather than "only the 1st
pte is wr-protected", since IIUC the current "pte" points at the 1st pte
entry only.
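
The path I'm staring at is roughly this (mm/hugetlb.c, trimmed):

	/* hugetlb_change_protection(), per huge page */
	old_pte = huge_ptep_modify_prot_start(vma, address, ptep);
	pte = huge_pte_modify(old_pte, newprot);
	if (uffd_wp)
		pte = huge_pte_mkuffd_wp(pte);
	huge_ptep_modify_prot_commit(vma, address, ptep, old_pte, pte);

so everything hinges on whether the commit side fans that single pte value
back out across all the contig entries.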

Thanks,

-- 
Peter Xu



