[PATCH v2 10/61] kernel/fork: Use maple tree for dup_mmap() during forking
Hillf Danton
hdanton at sina.com
Thu Aug 19 21:04:55 PDT 2021
On Thu, 19 Aug 2021 13:32:58 +0000 Liam R. Howlett wrote:
>* Hillf Danton <hdanton at sina.com> [210818 22:01]:
>> On Wed, 18 Aug 2021 14:54:29 +0000 Liam R. Howlett wrote:
>> >* Hillf Danton <hdanton at sina.com> [210818 04:36]:
>> >> On Tue, 17 Aug 2021 15:47:11 +0000 Liam R. Howlett wrote:
>> >> >
>> >> > static inline void mmget(struct mm_struct *mm)
>> >> > {
>> >> > + mt_set_in_rcu(&mm->mm_mt);
>> >> > atomic_inc(&mm->mm_users);
>> >> > }
>> >> >
>> >> > static inline bool mmget_not_zero(struct mm_struct *mm)
>> >> > {
>> >> > + /*
>> >> > + * There is a race below during task tear down that can cause the =
>maple
>> >> > + * tree to enter rcu mode with only a single user. If this race
>> >> > + * happens, the result would be that the maple tree nodes would re=
>main
>> >> > + * active for an extra RCU read cycle.
>> >> > + */
>> >> > + mt_set_in_rcu(&mm->mm_mt);
>> >> > return atomic_inc_not_zero(&mm->mm_users);
>> >> > }
>> >>
>> >> Nit, leave the mm with zero refcount intact.
>> >>
>> >> if (atomic_inc_not_zero(&mm->mm_users)) {
>> >> mt_set_in_rcu(&mm->mm_mt);
>> >> return true;
>> >> }
>> >> return false;
>> >
>> >Thanks for looking at this.
>> >
>> >I thought about that, but came up with the following scenario:
>> >
>> >thread 1 thread 2
>> >mmget(mm)
>> > mmget_not_zero() enter..
>> > atomic_inc_not_zero(&mm->mm_users)
>> >mmput(mm)
>> > mt_set_in_rcu(&mm->mm_mt);
>> > return true;
>> >
>>=20
>> At first glance, given the above mmget, mmput will not hurt anyone.
>
>In the case above, the maple tree enters RCU mode with a single user.
>This will have the result of nodes being freed in RCU mode which is the
>same result as what happens as it is written, except the racing thread
>wins (in this case). I thought this was what you were trying to fix?
>
>>=20
>> >
>> >So I think the above does not remove the race, but does add instructions
>>=20
>> If the mm refcount drops to one after mmput then it is one before
>> atomic_inc_not_zero() which only ensures the mm is stable afterwards
>> until mmput again.
>
>Right. The race we are worried about is the zero referenced mm. If
>mm->mm_users is safe, then mm->mm_mt is also safe.
>
>>=20
>> >to each call to mmget_not_zero(). I thought the race of having nodes
>> >sitting around for an rcu read cycle was worth the trade off.
>>=20
>> Is one ounce of the mm stability worth two pounds? Or three?
>
>I don't see a stability problem with the way it is written.
On the maple tree side, I see in
[PATCH v2 05/61] Maple Tree: Add new data structure
+ * MAPLE_USE_RCU Operate in read/copy/update mode for multi-readers
<...>
+/**
+ * mt_set_in_rcu() - Switch the tree to RCU safe mode.
+ */
+static inline void mt_set_in_rcu(struct maple_tree *mt)
+{
+ if (mt_in_rcu(mt))
+ return;
+
+ mtree_lock(mt);
+ mt->ma_flags |= MAPLE_USE_RCU;
+ mtree_unlock(mt);
+}
and on the mm side, however, if atomic_inc_not_zero(&mm->mm_users) fails
then who can be the "RCU multi-readers"?
>Your change does not remove the race.
If atomic_inc_not_zero() fails then there is no pre-condition in any form
for race; if it succeeds then the race window is slammed.
>Can you explain how the stability is affected negatively by the way it
>is written?
Hard to find the correct answer without knowing why you prefer to update
the flags for mm->mm_mt with mm->mm_users dropping down to ground.
More information about the maple-tree
mailing list