[PATCH v4 3/4] mm: Optimize mprotect() by PTE-batching

Lorenzo Stoakes lorenzo.stoakes at oracle.com
Tue Jul 1 01:51:26 PDT 2025


On Tue, Jul 01, 2025 at 09:30:51AM +0100, Ryan Roberts wrote:
> >> In an ideal world we would flatten and just have mprotect_folio_pte_batch()
> >> return the batch size considering all the relevant PTE bits AND the
> >> AnonExclusive bit on the pages. IIRC one of Dev's earlier versions modified the
> >> core folio_pte_batch() function to also look at the AnonExclusive bit, but I
> >> really disliked changing that core function (I think others did too?).
> >
> > Yeah let's not change the core function.
> >
> > My suggestion is to have mprotect_folio_pte_batch() do this.
> >
> >>
> >> So barring that approach, we are really only left with the batch and sub-batch
> >> approach - although, yes, it could be abstracted more. We could maintain a
> >> context struct that persists across all calls to mprotect_folio_pte_batch() and
> >> it can use that to keep its state to remember if we are in the middle of a
> >> sub-batch and decide either to call folio_pte_batch() to get a new batch, or
> >> call anon_exclusive_batch() to get the next sub-batch within the current batch.
> >> But that started to feel overly abstracted to me.
> >
> > Having this nested batch/sub-batch loop really feels worse. You just get lost in
> > the complexity here very easily.
> >
> > But I'm also not sure we need to maintain _that_ much state?
> >
> > We're already looping over all of the PTEs here, so abstracting _the entire
> > loop_ and all the sub-batch stuff into another function (sensibly, I think,
> > mprotect_folio_pte_batch()) so that it handles this for you makes a ton of
> > sense.
>
> So effectively turn mprotect_folio_pte_batch() into an iterator; have a struct
> and a function to init the struct for the number of ptes we want to iterate
> over, then a per iteration function that progressively returns batches?

Is that really necessary though?

The idea is that mprotect_folio_pte_batch() returns the nr_ptes _taking into
account the PAE (PageAnonExclusive) stuff_.

Would this break anything?

We might need to pass a flag to say 'don't account for this' for the prot numa
case.
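
To make the idea concrete, here is a rough userspace sketch, not kernel code.
Everything in it is made up for illustration: struct model_pte,
model_pte_batch(), model_change_range() and the respect_pae flag are
hypothetical names, and the model assumes the PTEs handed in already belong to
one folio and agree on all the other relevant PTE bits, so only the
AnonExclusive state can split a batch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: one entry per PTE mapping a page of the same folio. */
struct model_pte {
	bool anon_exclusive;	/* stand-in for the page's AnonExclusive state */
};

/*
 * Hypothetical flattened helper: return how many consecutive PTEs, starting
 * at ptes[0] and capped at max_nr, can be handled as one batch.
 *
 * With respect_pae set, the batch ends at the first PTE whose AnonExclusive
 * state differs from ptes[0]. With it clear (the 'don't account for this'
 * flag, e.g. for prot numa), the whole range is one batch.
 */
static size_t model_pte_batch(const struct model_pte *ptes, size_t max_nr,
			      bool respect_pae)
{
	size_t nr;

	if (max_nr == 0)
		return 0;
	if (!respect_pae)
		return max_nr;

	for (nr = 1; nr < max_nr; nr++) {
		if (ptes[nr].anon_exclusive != ptes[0].anon_exclusive)
			break;
	}
	return nr;
}

/*
 * Caller side: a single flat loop, no nested sub-batch loop and no iterator
 * state. Records each batch size in batch_sizes[] (which must have room for
 * 'total' entries) and returns the number of batches.
 */
static size_t model_change_range(const struct model_pte *ptes, size_t total,
				 bool respect_pae, size_t *batch_sizes)
{
	size_t i = 0, nr_batches = 0;

	while (i < total) {
		size_t nr = model_pte_batch(ptes + i, total - i, respect_pae);

		batch_sizes[nr_batches++] = nr;	/* "process" the batch */
		i += nr;
	}
	return nr_batches;
}
```

In this model, six PTEs with AnonExclusive states T T F F F T come back as
batches of 2, 3 and 1 when respect_pae is set, and as a single batch of 6 when
it is clear. Whether the real helper can be this simple depends on what else
the caller needs per sub-batch, which is the open question above.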

>
> Then we just have a simple loop here that gets the next batch and processes it?


