[PATCH v1] arm64: mm: Permit PTE SW bits to change in live mappings
Peter Xu
peterx at redhat.com
Wed Jun 19 12:04:41 PDT 2024
On Wed, Jun 19, 2024 at 04:58:32PM +0100, Ryan Roberts wrote:
> The code in question is:
>
> if (userfaultfd_pte_wp(vma, ptep_get(vmf->pte))) {
>         if (!userfaultfd_wp_async(vma)) {
>                 pte_unmap_unlock(vmf->pte, vmf->ptl);
>                 return handle_userfault(vmf, VM_UFFD_WP);
>         }
>
>         /*
>          * Nothing needed (cache flush, TLB invalidations,
>          * etc.) because we're only removing the uffd-wp bit,
>          * which is completely invisible to the user.
>          */
>         pte = pte_clear_uffd_wp(ptep_get(vmf->pte));
>
>         set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
>         /*
>          * Update this to be prepared for following up CoW
>          * handling
>          */
>         vmf->orig_pte = pte;
> }
>
> Perhaps we should consider a change to the following style as a cleanup?
>
> old_pte = ptep_modify_prot_start(vma, addr, pte);
> ptent = pte_clear_uffd_wp(old_pte);
> ptep_modify_prot_commit(vma, addr, pte, old_pte, ptent);
You're probably right that at least the access bit can be set racily here,
so we risk losing it when a race happens against the HW. The dirty bit
shouldn't be a concern in this case due to the missing W bit, IIUC.

IMO it's a matter of whether we'd like to make the access bit 100% accurate
when the race happens, while paying for that with an always slower generic
path. It does look cleaner, but maybe it's not very beneficial in practice.
>
> Regardless, this patch is still a correct and valuable change; arm64 arch
> doesn't care if SW bits are modified in valid mappings so we shouldn't be
> checking for it.
Agreed. Let's keep this discussion separate from the original patch if
that one already fixes things on its own.
>
> >
> >>
> >> /* creating or taking down mappings is always safe */
> >> if (!pte_valid(__pte(old)) || !pte_valid(__pte(new)))
> >> --
> >> 2.43.0
> >>
> >
> > When looking at this function I found this and caught my attention too:
> >
> > /* live contiguous mappings may not be manipulated at all */
> > if ((old | new) & PTE_CONT)
> >         return false;
> >
> > I'm now wondering how cont-ptes work with uffd-wp on arm64, from
> > either the hugetlb or mTHP PoV. This check may be relevant here as a start.
>
> When transitioning a block of ptes between cont and non-cont, we transition the
> block through invalid with tlb invalidation. See contpte_convert().
>
> >
> > The other thing is that since x86 doesn't have cont-ptes yet, uffd-wp
> > didn't take them into account, so there may be things overlooked, at
> > least on my side.
> > E.g., consider wr-protect one cont-pte huge pages on hugetlb:
> >
> > static inline pte_t huge_pte_mkuffd_wp(pte_t pte)
> > {
> >         return huge_pte_wrprotect(pte_mkuffd_wp(pte));
> > }
> >
> > I think it means that, so far, it only touches the 1st pte, not the
> > rest of the cont-ptes. I'm not sure whether it'll work if a write
> > happens on the rest.
>
> I'm not completely sure I follow your point. I think this should work correctly.
> The arm64 huge_pte code knows what size (and level) the huge pte is and spreads
> the passed in pte across all the HW ptes.
What I was considering is wr-protecting a 64K cont-pte entry on arm64:

  UFFDIO_WRITEPROTECT -> hugetlb_change_protection() -> huge_pte_mkuffd_wp()

What I'm expecting is that huge_pte_mkuffd_wp() would wr-protect all the
ptes, but that doesn't look to be the case right now. I'm not sure the HW
can identify "the whole 64K is wr-protected" in this case, rather than
"only the 1st pte is wr-protected", as IIUC the current "pte" points to
only the 1st pte entry.
Thanks,
--
Peter Xu
More information about the linux-arm-kernel mailing list