Maple Tree 20201029

Axel Rasmussen axelrasmussen at google.com
Fri Oct 30 18:03:25 EDT 2020


On Fri, Oct 30, 2020 at 10:59 AM Liam R. Howlett
<Liam.Howlett at oracle.com> wrote:
>
> Hello,
>
> Please find the git repo of the maple tree at:
>  https://github.com/oracle/linux-uek/tree/maple/mainline
>   The current tag is howlett/maple/20201029

Thanks for the update, Liam!

I just kicked off a new set of A/B performance tests with this branch.
To work around the test issues last time, I rebased onto Linus' v5.9
tag first (that is, commit bbf5c979011a099af5dc76498918ed7df445635b).
All of your commits applied cleanly, I didn't need to touch anything
or resolve any conflicts.

Based upon last time, the full suite of tests will take ~9 hours or so
to complete. So, I'll send out the results tomorrow if all goes well.

>
>
> This code base is still based on 5.9.0-rc1.  I will be updating to whatever is
> current prior to any further development.
>
>
> The Changes from howlett/maple/20201009:
>  - Fixed brk regression caused by taking an unnecessary lock.  Performance is
>    around -16% slower than rbtree in my testing versus being a magnitude
>    (or more?) slower.
>  - Added mas_node_store(); this method only modifies a single node.  I consider
>    this the "medium speed" path, with mas_slot_store() being fast, and
>    split/spanning_store/rebalance being slow.
>  - Added bulk allocations of nodes.  This is especially useful when forking a
>    process.  Calling mas_entry_count() with the expected number of items will
>    allocate enough nodes to contain the data.
>  - Optimized forking of larger processes by creating a 'bulk store' method when
>    using the mas_entry_count().  This, along with its paired function
>    mas_destroy() will more optimally split nodes for bulk insertions.
>  - Removed 128 Byte node support.
>  - Left a space in each node for insertions when pushing data left/right.  This
>    reduces jitter seen when adding/removing a single entry and avoids the slow
>    path.
>  - Removed mas_offset()/mas_set_offset() in favour of directly setting a
>    variable in the maple state.  This is an artifact of the bulk allocation
>    needing to use the overloaded space in the maple state.
>
>
> The synthetic tests are almost all still slower.  We are identifying the
> problem areas and still optimizing on the maple tree side.  Please note that
> RCU is turned off and the mmap_sem usage is unmodified.
>
> The milestone I hit with howlett/maple/20201029 is a faster kernbench on a 144
> thread build.
>
> 5.9.0-rc1 rb tree:
>
> 21592.22user 2995.71system 3:38.52elapsed 11251%CPU (0avgtext+0avgdata 469212maxresident)k
> 15400inputs+17103904outputs (57804major+606382185minor)pagefaults 0swaps
> 21993.77user 3031.04system 3:37.24elapsed 11518%CPU (0avgtext+0avgdata 473396maxresident)k
> 0inputs+17103912outputs (109469major+606420309minor)pagefaults 0swaps
> 21950.81user 3028.40system 3:38.47elapsed 11433%CPU (0avgtext+0avgdata 471032maxresident)k
> 0inputs+17103904outputs (90825major+606505956minor)pagefaults 0swaps
>
> howlett/maple/20201029:
>
> 21816.37user 2949.97system 3:36.60elapsed 11433%CPU (0avgtext+0avgdata 471528maxresident)k
> 8inputs+17103912outputs (66169major+601199891minor)pagefaults 0swaps
> 22008.08user 2978.80system 3:37.07elapsed 11510%CPU (0avgtext+0avgdata 468712maxresident)k
> 8inputs+17103888outputs (36911major+601323942minor)pagefaults 0swaps
> 22050.22user 2978.20system 3:36.47elapsed 11561%CPU (0avgtext+0avgdata 469044maxresident)k
> 0inputs+17103912outputs (85798major+601162956minor)pagefaults 0swaps
>
> If anyone is willing to benchmark the code themselves, I'd be very
> interested in hearing what they find.
>
> Thank you,
> Liam



More information about the maple-tree mailing list