[PATCH v2 10/61] kernel/fork: Use maple tree for dup_mmap() during forking

Liam Howlett liam.howlett at oracle.com
Thu Aug 19 06:32:58 PDT 2021


* Hillf Danton <hdanton at sina.com> [210818 22:01]:
> On Wed, 18 Aug 2021 14:54:29 +0000 Liam R. Howlett wrote:
> >* Hillf Danton <hdanton at sina.com> [210818 04:36]:
> >> On Tue, 17 Aug 2021 15:47:11 +0000 Liam R. Howlett wrote:
> >> >
> >> >  static inline void mmget(struct mm_struct *mm)
> >> >  {
> >> > +	mt_set_in_rcu(&mm->mm_mt);
> >> >  	atomic_inc(&mm->mm_users);
> >> >  }
> >> >
> >> >  static inline bool mmget_not_zero(struct mm_struct *mm)
> >> >  {
> >> > +	/*
> >> > +	 * There is a race below during task tear down that can cause the maple
> >> > +	 * tree to enter rcu mode with only a single user.  If this race
> >> > +	 * happens, the result would be that the maple tree nodes would remain
> >> > +	 * active for an extra RCU read cycle.
> >> > +	 */
> >> > +	mt_set_in_rcu(&mm->mm_mt);
> >> >  	return atomic_inc_not_zero(&mm->mm_users);
> >> >  }
> >>
> >> Nit, leave the mm with zero refcount intact.
> >>
> >>  	if (atomic_inc_not_zero(&mm->mm_users)) {
> >> 		mt_set_in_rcu(&mm->mm_mt);
> >> 		return true;
> >> 	}
> >> 	return false;
> >
> >Thanks for looking at this.
> >
> >I thought about that, but came up with the following scenario:
> >
> >thread 1	thread 2
> >mmget(mm)
> >		mmget_not_zero() enter..
> >		atomic_inc_not_zero(&mm->mm_users)
> >mmput(mm)
> >		mt_set_in_rcu(&mm->mm_mt);
> >		return true;
> >
> 
> At first glance, given the above mmget, mmput will not hurt anyone.

In the case above, the maple tree enters RCU mode with a single user.
This will have the result of nodes being freed in RCU mode which is the
same result as what happens as it is written, except the racing thread
wins (in this case).  I thought this was what you were trying to fix?

> 
> >
> >So I think the above does not remove the race, but does add instructions
> 
> If the mm refcount drops to one after mmput then it is one before
> atomic_inc_not_zero() which only ensures the mm is stable afterwards
> until mmput again.

Right.  The race we are worried about is the zero referenced mm.  If
mm->mm_users is safe, then mm->mm_mt is also safe.

> 
> >to each call to mmget_not_zero().  I thought the race of having nodes
> >sitting around for an rcu read cycle was worth the trade off.
> 
> Is one ounce of the mm stability worth two pounds? Or three?

I don't see a stability problem with the way it is written.  Your change
does not remove the race.  Can you explain how the stability is affected
negatively by the way it is written?

Thanks,
Liam


More information about the maple-tree mailing list