[PATCH v4 12/66] kernel/fork: Use maple tree for dup_mmap() during forking

Vlastimil Babka vbabka at suse.cz
Thu Dec 16 03:09:13 PST 2021

On 12/1/21 15:29, Liam Howlett wrote:
> From: "Liam R. Howlett" <Liam.Howlett at Oracle.com>
> The maple tree was already tracking VMAs in this function by an earlier
> commit, but the rbtree iterator was being used to iterate the list.
> Change the iterator to use a maple tree native iterator and switch to
> the maple tree advanced API to avoid multiple walks of the tree during
> insert operations.  Unexport the now-unused vma_store() function.
> We track whether we need to free the VMAs and tree nodes through RCU
> (i.e. whether there have been multiple threads that can see the mm_struct
> simultaneously; by pthread(), ptrace() or looking at /proc/$pid/maps).
> This setting is sticky because it's too tricky to decide when it's safe
> to exit RCU mode.

It's not immediately clear to me why enabling the RCU tracking in mmget is
part of the dup_mmap() change. Is there a dependency between the two?

> For performance reasons we bulk allocate the maple tree nodes.  The node
> calculations are done internally to the tree and use the VMA count and
> assume the worst-case node requirements.  The VM_DONT_COPY flag does
> not allow for the most efficient copy method of the tree and so a bulk
> loading algorithm is used.
> Signed-off-by: Liam R. Howlett <Liam.Howlett at Oracle.com>
> Signed-off-by: Matthew Wilcox (Oracle) <willy at infradead.org>

Acked-by: Vlastimil Babka <vbabka at suse.cz>

>  static inline bool mmget_not_zero(struct mm_struct *mm)
>  {
> +	/*
> +	 * There is a race below during task tear down that can cause the maple

What does 'below' refer to here?

> +	 * tree to enter rcu mode with only a single user.  If this race
> +	 * happens, the result would be that the maple tree nodes would remain
> +	 * active for an extra RCU read cycle.
> +	 */
> +	if (!mt_in_rcu(&mm->mm_mt))
> +		mm_set_in_rcu(mm);
>  	return atomic_inc_not_zero(&mm->mm_users);
>  }
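
To make the "sticky" behaviour in the hunk above concrete, here is a minimal
userspace sketch using C11 atomics. It is not the kernel code: struct toy_mm
and every toy_* helper are hypothetical stand-ins for mm_struct, mt_in_rcu(),
mm_set_in_rcu() and atomic_inc_not_zero(). It only models the ordering in
the quoted hunk (flag first, refcount second), and it reflects one reading of
the comment: the flag can end up set even when the reference is not taken.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for mm_struct; not the kernel definition. */
struct toy_mm {
	atomic_int users;	/* models mm->mm_users */
	atomic_bool in_rcu;	/* models the maple tree's RCU-mode flag */
};

static void toy_mm_init(struct toy_mm *mm)
{
	atomic_init(&mm->users, 1);
	atomic_init(&mm->in_rcu, false);
}

/* Models atomic_inc_not_zero(): take a reference unless the count is 0. */
static bool toy_inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/* Same ordering as the quoted hunk: set the flag, then try the refcount. */
static bool toy_mmget_not_zero(struct toy_mm *mm)
{
	assert(mm != NULL);
	if (!atomic_load(&mm->in_rcu))
		atomic_store(&mm->in_rcu, true);
	return toy_inc_not_zero(&mm->users);
}

/* Dropping a reference never clears in_rcu; that is the sticky part. */
static void toy_mmput(struct toy_mm *mm)
{
	atomic_fetch_sub(&mm->users, 1);
}
```

In this model, calling toy_mmget_not_zero() on an mm whose user count has
already dropped to zero returns false but still leaves in_rcu set, which is
the kind of harmless "RCU mode with a single user" outcome the comment seems
to describe.
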
> diff --git a/kernel/fork.c b/kernel/fork.c
> index cc9bb95c7678..c9f8465d8ae2 100644

