Maple Tree 20201029

Liam R. Howlett Liam.Howlett at Oracle.com
Fri Oct 30 13:57:06 EDT 2020


Hello,

Please find the git repo of the maple tree at:
 https://github.com/oracle/linux-uek/tree/maple/mainline
  The current tag is howlett/maple/20201029


This code base is still based on 5.9.0-rc1.  I will be updating to whatever is
current prior to any further development.


The Changes from howlett/maple/20201009:
 - Fixed brk regression caused by taking an unnecessary lock.  Performance is
   around -16% slower than rbtree in my testing versus being a magnitude
   (or more?) slower.
 - Added mas_node_store(); this method only modifies a single node.  I consider
   this the "medium speed" path, with mas_slot_store() being fast, and
   split/spanning_store/rebalance being slow.
 - Added bulk allocations of nodes.  This is especially useful when forking a
   process.  Calling mas_entry_count() with the expected number of items will
   allocate enough nodes to contain the data.
 - Optimized forking of larger processes by creating a 'bulk store' method when
   using the mas_entry_count().  This, along with its paired function
   mas_destroy() will more optimally split nodes for bulk insertions.
 - Removed 128 Byte node support.
 - Left a space in each node for insertions when pushing data left/right.  This
   reduces jitter seen when adding/removing a single entry and avoids the slow
   path.
 - Removed mas_offset()/mas_set_offset() in favour of directly setting a
   variable in the maple state.  This is an artifact of the bulk allocation
   needing to use the overloaded space in the maple state.


The synthetic tests are almost all still slower.  We are identifying the
problem areas and still optimizing on the maple tree side.  Please note that
RCU is turned off and the mmap_sem usage is unmodified.

The milestone I hit with howlett/maple/20201029 is a faster kernbench on a 144
thread build.

5.9.0-rc1 rb tree:

21592.22user 2995.71system 3:38.52elapsed 11251%CPU (0avgtext+0avgdata 469212maxresident)k
15400inputs+17103904outputs (57804major+606382185minor)pagefaults 0swaps
21993.77user 3031.04system 3:37.24elapsed 11518%CPU (0avgtext+0avgdata 473396maxresident)k
0inputs+17103912outputs (109469major+606420309minor)pagefaults 0swaps
21950.81user 3028.40system 3:38.47elapsed 11433%CPU (0avgtext+0avgdata 471032maxresident)k
0inputs+17103904outputs (90825major+606505956minor)pagefaults 0swaps

howlett/maple/20201029:

21816.37user 2949.97system 3:36.60elapsed 11433%CPU (0avgtext+0avgdata 471528maxresident)k
8inputs+17103912outputs (66169major+601199891minor)pagefaults 0swaps
22008.08user 2978.80system 3:37.07elapsed 11510%CPU (0avgtext+0avgdata 468712maxresident)k
8inputs+17103888outputs (36911major+601323942minor)pagefaults 0swaps
22050.22user 2978.20system 3:36.47elapsed 11561%CPU (0avgtext+0avgdata 469044maxresident)k
0inputs+17103912outputs (85798major+601162956minor)pagefaults 0swaps

If anyone is willing to benchmark the code themselves, I'd be very
interested in hearing what they find.

Thank you,
Liam



More information about the maple-tree mailing list