[PATCH 3/4] maple_tree: use the correct min to calculate split

Wei Yang richard.weiyang at gmail.com
Thu Nov 7 18:55:42 PST 2024


On Sun, Oct 20, 2024 at 06:00:35PM -0400, Liam R. Howlett wrote:
>* Wei Yang <richard.weiyang at gmail.com> [241019 22:46]:
>> The check in mab_calc_split() is to make sure the gap between
>> [0, split] won't be too small. But we don't pass the correct min.
>> 
>> First we may encounter a pivot[split] smaller than min. For example:
>> 
>>        mt_init_flags(mt, 0);
>>        for (count = 0; count <= 240; count++) {
>>                mas_set(&mas, count);
>>                mas_store(&mas, xa_mk_value(count));
>>        }
>> 
>> On splitting root for storing 240, the pivot[split] is smaller than min.
>> This result a huge (pivot[split] - min).
>
>This is fine.
>
>There is an open work item to make it more accurate at higher levels,
>but it's not a problem as it is.
>
>Each level upwards needs a better 'minimum span', meaning that the node
>should have at least mas.min - mas.min + level * something.

Don't follow this. 

First, I guess it is mas.max - mas.min?

Then there is sth related to level. But I don't see level is used for
calculation. Would you mind sharing more your original thought?

>
>It works today for leaves, somewhat.
>

To me it works by coincidence.

The condition check is:

  (bn->pivot[split] - min) < (slot_count - 1) 

while slot_count is a constent, 16 for leaf node.

And we always pass 0 as min, the condition for leaf is:

  bn->pivot[split] < 15

For the mmap world, per my understanding, we won't expect an address is smaller
than 15.

>> 
>> Second prev_l_mas.min is not initialized for the first iteration. This
>> means on splitting leaf node, this value is mostly taking no effect.
>
>No, it is set to 0.  Not initialized would cause random data loss.
>
>See MA_STATE() in the header.
>

Right, I want to say not initialized to the value we want.

Will be more careful next time.

>> 
>> Since we are now calculating the split of mas->node, we should use the
>> mas->min instead of prev_l_mas.min.
>
>This sounds reasonable, but considering what this number is used for, I
>don't see how it is working as well as it is today.  I will need to look
>deeper into this.
>
>> 
>> Signed-off-by: Wei Yang <richard.weiyang at gmail.com>
>> CC: Liam R. Howlett <Liam.Howlett at Oracle.com>
>> CC: Sidhartha Kumar <sidhartha.kumar at oracle.com>
>> CC: Lorenzo Stoakes <lorenzo.stoakes at oracle.com>
>> ---
>>  lib/maple_tree.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/lib/maple_tree.c b/lib/maple_tree.c
>> index 894dc5e9436e..c2d4b188646c 100644
>> --- a/lib/maple_tree.c
>> +++ b/lib/maple_tree.c
>> @@ -3357,7 +3357,7 @@ static void mas_split(struct ma_state *mas, struct maple_big_node *b_node)
>>  		if (mas_push_data(mas, height, &mast, false))
>>  			break;
>>  
>> -		split = mab_calc_split(mas, b_node, &mid_split, prev_l_mas.min);
>> +		split = mab_calc_split(mas, b_node, &mid_split, mas->min);
>>  		mast_split_data(&mast, mas, split);
>>  		/*
>>  		 * Usually correct, mab_mas_cp in the above call overwrites
>> -- 
>> 2.34.1
>> 

-- 
Wei Yang
Help you, Help me



More information about the maple-tree mailing list