[PATCH 00/14] Reduce preallocations for maple tree
Yin, Fengwei
fengwei.yin at intel.com
Sun Jun 4 23:18:17 PDT 2023
On 6/5/2023 12:41 PM, Yin Fengwei wrote:
> Hi Peng,
>
> On 6/5/23 11:28, Peng Zhang wrote:
>>
>>
>> On 2023/6/2 16:10, Yin, Fengwei wrote:
>>> Hi Liam,
>>>
>>> On 6/1/2023 10:15 AM, Liam R. Howlett wrote:
>>>> Initial work on preallocations showed no performance regression during
>>>> testing, but recently some users (both on-list [1] and off-list [android])
>>>> have reported that preallocating the worst-case number of nodes has
>>>> caused some slowdown. This patch set addresses the number of
>>>> allocations in a few ways.
>>>>
>>>> Most munmap() operations remove a single VMA, so leverage the fact that
>>>> the maple tree can place a single pointer at range 0 - 0 without
>>>> allocating. This is done by changing the index in the 'sidetree'.
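
A minimal sketch of the single-VMA fast path described above
(gather_single_vma() is a made-up helper for illustration, not the actual
do_vmi_align_munmap() change):

/* A lone entry stored at range 0 - 0 is encoded in the tree's root
 * pointer, so gathering a single detached VMA into the side tree needs
 * no maple node allocation.
 */
static int gather_single_vma(struct maple_tree *mt_detach,
                             struct vm_area_struct *vma)
{
        MA_STATE(mas_detach, mt_detach, 0, 0);

        return mas_store_gfp(&mas_detach, vma, GFP_KERNEL);
}
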
>>>>
>>>> Re-introduce the entry argument to mas_preallocate() so that a more
>>>> intelligent guess of the node count can be made.
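
The diff further below shows the new signature, int mas_preallocate(struct
ma_state *mas, void *entry, gfp_t gfp). A sketch of a caller with the entry
passed through (the helper name is illustrative, loosely modeled on
vma_iter_prealloc()):

/* Passing the entry lets mas_preallocate() examine the pending write
 * (store type, how full the target node is) and request fewer than the
 * worst-case number of nodes.
 */
static int vma_iter_prealloc_sketch(struct vma_iterator *vmi,
                                    struct vm_area_struct *vma)
{
        return mas_preallocate(&vmi->mas, vma, GFP_KERNEL);
}
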
>>>>
>>>> Patches are in the following order:
>>>> 0001-0002: Testing framework for benchmarking some operations
>>>> 0003-0004: Reduction of maple node allocation in sidetree
>>>> 0005: Small cleanup of do_vmi_align_munmap()
>>>> 0006-0013: mas_preallocate() calculation change
>>>> 0014: Change the vma iterator order
>>> I ran the AIM:page_test benchmark on an Ice Lake 48C/96T + 192G RAM
>>> platform with this patchset.
>>>
>>> The result shows a slight improvement:
>>> Base (next-20230602):
>>> 503880
>>> Base with this patchset:
>>> 519501
>>>
>>> But both are far from the no-regression result (commit 7be1c1a3c7b1):
>>> 718080
>>>
>>>
>>> Some other information I collected:
>>> With Base, mas_alloc_nodes() is always hit with a request of 7 nodes.
>>> With this patchset, the requests are 1 or 5.
>>>
>>> I suppose this is the reason for the improvement from 503880 to 519501.
>>>
>>> With commit 7be1c1a3c7b1, mas_store_gfp() in do_brk_flags() never triggered
>>> a mas_alloc_nodes() call. Thanks.
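
(For reference, per-call request sizes like these can be observed with a
throwaway hack at the top of mas_alloc_nodes() in lib/maple_tree.c; this is
purely illustrative and assumes the mas_alloc_req() helper already in that
file:)

        /* Throwaway debug hack: log how many nodes each slow-path
         * allocation asks for.
         */
        unsigned int requested = mas_alloc_req(mas);

        if (requested)
                trace_printk("mas_alloc_nodes: request=%u\n", requested);
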
>> Hi Fengwei,
>>
>> I think it may be related to an inaccurate node count in the
>> pre-allocation. I slightly modified the pre-allocation in this patchset,
>> but I don't know if it helps. It would be great if you could help test it
>> and help pinpoint the cause. Below is the diff, which can be applied on
>> top of this patchset.
> I tried the patch; it eliminates the mas_alloc_nodes() calls during the
> test, but the benchmark result only improved a little:
> 529040
>
> It is still much lower than the no-regression result. I will also
> double-check the no-regression result.
Just noticed that commit f5715584af95 gives validate_mm() two implementations
selected by CONFIG_DEBUG_VM instead of CONFIG_DEBUG_VM_MAPLE_TREE. I have
CONFIG_DEBUG_VM but not CONFIG_DEBUG_VM_MAPLE_TREE defined, so it was not an
apples-to-apples comparison.
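
(Roughly, the gating after that commit looks like the sketch below; this is
paraphrased rather than the exact mm/mmap.c code, but it shows why having
CONFIG_DEBUG_VM set is enough to slow the benchmark down:)

/* With CONFIG_DEBUG_VM set, every VMA update runs a full consistency walk;
 * without it, validate_mm() compiles away to nothing.
 */
#ifdef CONFIG_DEBUG_VM
void validate_mm(struct mm_struct *mm);        /* walks the VMAs, checks the tree */
#else
#define validate_mm(mm) do { } while (0)
#endif
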
I disabled CONFIG_DEBUG_VM, re-ran the test, and got:
Before preallocation change (7be1c1a3c7b1):
770100
After preallocation change (28c5609fb236):
680000
With Liam's fix:
702100
plus Peng's fix:
725900
Regards
Yin, Fengwei
>
>
> Regards
> Yin, Fengwei
>
>>
>> Thanks,
>> Peng
>>
>> diff --git a/lib/maple_tree.c b/lib/maple_tree.c
>> index 5ea211c3f186..e67bf2744384 100644
>> --- a/lib/maple_tree.c
>> +++ b/lib/maple_tree.c
>> @@ -5575,9 +5575,11 @@ int mas_preallocate(struct ma_state *mas, void *entry, gfp_t gfp)
>> goto ask_now;
>> }
>>
>> - /* New root needs a singe node */
>> - if (unlikely(mte_is_root(mas->node)))
>> - goto ask_now;
>> + if ((node_size == wr_mas.node_end + 1 &&
>> + mas->offset == wr_mas.node_end) ||
>> + (node_size == wr_mas.node_end &&
>> + wr_mas.offset_end - mas->offset == 1))
>> + return 0;
>>
>> /* Potential spanning rebalance collapsing a node, use worst-case */
>> if (node_size - 1 <= mt_min_slots[wr_mas.type])
>> @@ -5590,7 +5592,6 @@ int mas_preallocate(struct ma_state *mas, void *entry, gfp_t gfp)
>> if (likely(!mas_is_err(mas)))
>> return 0;
>>
>> - mas_set_alloc_req(mas, 0);
>> ret = xa_err(mas->node);
>> mas_reset(mas);
>> mas_destroy(mas);
>>
>>
>>>
>>>
>>> Regards
>>> Yin, Fengwei
>>>
>>>>
>>>> [1] https://lore.kernel.org/linux-mm/202305061457.ac15990c-yujie.liu@intel.com/
>>>>
>>>> Liam R. Howlett (14):
>>>> maple_tree: Add benchmarking for mas_for_each
>>>> maple_tree: Add benchmarking for mas_prev()
>>>> mm: Move unmap_vmas() declaration to internal header
>>>> mm: Change do_vmi_align_munmap() side tree index
>>>> mm: Remove prev check from do_vmi_align_munmap()
>>>> maple_tree: Introduce __mas_set_range()
>>>> mm: Remove re-walk from mmap_region()
>>>> maple_tree: Re-introduce entry to mas_preallocate() arguments
>>>> mm: Use vma_iter_clear_gfp() in nommu
>>>> mm: Set up vma iterator for vma_iter_prealloc() calls
>>>> maple_tree: Move mas_wr_end_piv() below mas_wr_extend_null()
>>>> maple_tree: Update mas_preallocate() testing
>>>> maple_tree: Refine mas_preallocate() node calculations
>>>> mm/mmap: Change vma iteration order in do_vmi_align_munmap()
>>>>
>>>> fs/exec.c | 1 +
>>>> include/linux/maple_tree.h | 23 ++++-
>>>> include/linux/mm.h | 4 -
>>>> lib/maple_tree.c | 78 ++++++++++----
>>>> lib/test_maple_tree.c | 74 +++++++++++++
>>>> mm/internal.h | 40 ++++++--
>>>> mm/memory.c | 16 ++-
>>>> mm/mmap.c | 171 ++++++++++++++++---------------
>>>> mm/nommu.c | 45 ++++----
>>>> tools/testing/radix-tree/maple.c | 59 ++++++-----
>>>> 10 files changed, 331 insertions(+), 180 deletions(-)
>>>>