[PATCH 3/5] maple_tree: use vacant nodes to reduce worst case allocations

Wei Yang richard.weiyang at gmail.com
Fri Nov 15 17:41:16 PST 2024


On Fri, Nov 15, 2024 at 03:34:55PM -0500, Sidhartha Kumar wrote:
>On 11/15/24 2:52 AM, Wei Yang wrote:
>> On Thu, Nov 14, 2024 at 12:05:22PM -0500, Sidhartha Kumar wrote:
>> > In order to determine the store type for a maple tree operation, a walk
>> > of the tree is done through mas_wr_walk(). This function descends the
>> > tree until a spanning write is detected or we reach a leaf node. While
>> > descending, keep track of the height at which we encounter a node with
>> > available space. This is done by checking if mas->end is less than the
>> > number of slots a given node type can fit.
>> > 
>> > Now that the height of the vacant node is tracked, we can use the
>> > difference between the height of the tree and the height of the vacant
>> > node to know how many levels we will have to propagate creating new
>> > nodes. Update mas_prealloc_calc() to consider the vacant height and
>> > reduce the number of worst allocations.
>> > 
>> > Rebalancing stores are not supported and fall back to using the full
>> > height of the tree for allocations.
>> > 
>> > Update preallocation testing assertions to take into account vacant
>> > height.
>> > 
>> > Signed-off-by: Sidhartha <sidhartha.kumar at oracle.com>
>> > ---
>> > include/linux/maple_tree.h       |  2 +
>> > lib/maple_tree.c                 | 13 +++--
>> > tools/testing/radix-tree/maple.c | 97 +++++++++++++++++++++++++++++---
>> > 3 files changed, 100 insertions(+), 12 deletions(-)
>> > 
>> > diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h
>> > index cbbcd18d4186..7d777aa2d9ed 100644
>> > --- a/include/linux/maple_tree.h
>> > +++ b/include/linux/maple_tree.h
>> > @@ -463,6 +463,7 @@ struct ma_wr_state {
>> > 	void __rcu **slots;		/* mas->node->slots pointer */
>> > 	void *entry;			/* The entry to write */
>> > 	void *content;			/* The existing entry that is being overwritten */
>> > +	unsigned char vacant_height;	/* Depth of lowest node with free space */
>>                               ^^^           ^^^
>> 
>> Would this be a little misleading?
>> 
>
>Could you elaborate on how its misleading?
>

As you mentioned in previous patch, depth and height has different meaning.

Root node has depth of 0 and height of 1. So I may wandering whether this is
depth or height.

>> > };
>> > 
>> > #define mas_lock(mas)           spin_lock(&((mas)->tree->ma_lock))
>> > @@ -498,6 +499,7 @@ struct ma_wr_state {
>> > 		.mas = ma_state,					\
>> > 		.content = NULL,					\
>> > 		.entry = wr_entry,					\
>> > +		.vacant_height = 0					\
>> > 	}
>> > 
>> > #define MA_TOPIARY(name, tree)						\
>> > diff --git a/lib/maple_tree.c b/lib/maple_tree.c
>> > index 21289e350382..f14d70c171c2 100644
>> > --- a/lib/maple_tree.c
>> > +++ b/lib/maple_tree.c
>> > @@ -3545,6 +3545,9 @@ static bool mas_wr_walk(struct ma_wr_state *wr_mas)
>> > 		if (ma_is_leaf(wr_mas->type))
>> > 			return true;
>> > 
>> > +		if (mas->end < mt_slots[wr_mas->type] - 1)
>> > +			wr_mas->vacant_height = mas->depth + 1;
>> 
>> For some cases in rebalance, we may split data into three parts, which means
>> we need 2 extra vacant slot.
>> 
>> Maybe this check is not accurate?
>> 
>
>The triple split scenario which you are describing comes from the spanning
>store case not on the wr_rebalance case. There is a check before we set
>vacant height to return if is_span_wr() so I believe this is correct still.
>

Hmm... I come up with a case in which vacant_height may not high enough.

Here is the subtree where spanning write locates. The first level is the
parent node of the second level nodes.

                vacant node
                +--------+-+-+-------+
                |        |l|r|       |
                +--------+-+-+-------+

                l                 r                 
    +------+    +----+-------+    +---------+--+    +------+
    |      |    |    |       |    |         |  |    |      |
    +------+    +----+-------+    +---------+--+    +------+
                     ^                      ^
		     |                      |
		   index                   last

When mas_wr_walk_descend() to node l, mas_is_span_wr() return true since last
is in the right sibling, node r. Let's say the parent is vacant and l/r is
leaf. So the request number is (1 * 3) + 1.

Let's assume:

  * vacant node is just sufficient
  * l's left part + r's right part is sufficient but not overflow

Then the "new vacant node" would be deficient and needs another round
rebalance.

For this case, I am afraid it doesn't allocate enough nodes. Or I
misunderstand this?

-- 
Wei Yang
Help you, Help me



More information about the maple-tree mailing list