[PATCH 00/12] mm: free retracted page table by RCU

Hugh Dickins hughd at google.com
Mon Jun 5 23:28:27 PDT 2023


On Fri, 2 Jun 2023, Jann Horn wrote:
> On Fri, Jun 2, 2023 at 6:37 AM Hugh Dickins <hughd at google.com> wrote:
> 
> > The most obvious vital thing (in the split ptlock case) is that it
> > remains a struct page with a usable ptl spinlock embedded in it.
> >
> > The question becomes more urgent when/if extending to replacing the
> > pagetable pmd by huge pmd in one go, without any mmap_lock: powerpc
> > wants to deposit the page table for later use even in the shmem/file
> > case (and all arches in the anon case): I did work out the details once
> > before, but I'm not sure whether I would still agree with myself; and was
> > glad to leave replacement out of this series, to revisit some time later.
> >
> > >
> > > So in particular, in handle_pte_fault() we can reach the "if
> > > (unlikely(!pte_same(*vmf->pte, entry)))" with vmf->pte pointing to a
> > > detached zeroed page table, but we're okay with that because in that
> > > case we know that !pte_none(vmf->orig_pte)&&pte_none(*vmf->pte) ,
> > > which implies !pte_same(*vmf->pte, entry) , which means we'll bail
> > > out?
> >
> > There is no current (even at end of series) circumstance in which we
> > could be pointing to a detached page table there; but yes, I want to
> > allow for that, and yes I agree with your analysis.
> 
> Hmm, what am I missing here?

I spent quite a while trying to reconstruct what I had been thinking,
what meaning of "detached" or "there" I had in mind when I asserted so
confidently "There is no current (even at end of series) circumstance
in which we could be pointing to a detached page table there".

But had to give up and get on with more useful work.
Of course you are right, and that is what this series is about.

Hugh

> 
> static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
> {
>   pte_t entry;
> 
>   if (unlikely(pmd_none(*vmf->pmd))) {
>     [not executed]
>   } else {
>     /*
>      * A regular pmd is established and it can't morph into a huge
>      * pmd by anon khugepaged, since that takes mmap_lock in write
>      * mode; but shmem or file collapse to THP could still morph
>      * it into a huge pmd: just retry later if so.
>      */
>     vmf->pte = pte_offset_map_nolock(vmf->vma->vm_mm, vmf->pmd,
>              vmf->address, &vmf->ptl);
>     if (unlikely(!vmf->pte))
>       [not executed]
>     // this reads a present readonly PTE
>     vmf->orig_pte = ptep_get_lockless(vmf->pte);
>     vmf->flags |= FAULT_FLAG_ORIG_PTE_VALID;
> 
>     if (pte_none(vmf->orig_pte)) {
>       [not executed]
>     }
>   }
> 
>   [at this point, a concurrent THP collapse operation detaches the page table]
>   // vmf->pte now points into a detached page table
> 
>   if (!vmf->pte)
>     [not executed]
> 
>   if (!pte_present(vmf->orig_pte))
>     [not executed]
> 
>   if (pte_protnone(vmf->orig_pte) && vma_is_accessible(vmf->vma))
>     [not executed]
> 
>   spin_lock(vmf->ptl);
>   entry = vmf->orig_pte;
>   // vmf->pte still points into a detached page table
>   if (unlikely(!pte_same(*vmf->pte, entry))) {
>     update_mmu_tlb(vmf->vma, vmf->address, vmf->pte);
>     goto unlock;
>   }
>   [...]
> }


More information about the linux-arm-kernel mailing list