Re: [PATCH 3/4] maple_tree: use the correct min to calculate split

Wei Yang <richard.weiyang@xxxxxxxxx> · Sat, 9 Nov 2024 12:22:11 +0000

On Fri, Nov 08, 2024 at 11:01:21PM -0500, Liam R. Howlett wrote:
>* Wei Yang <richard.weiyang@xxxxxxxxx> [241108 20:40]:
>> On Thu, Nov 07, 2024 at 10:19:37PM -0500, Liam R. Howlett wrote:
>> >* Wei Yang <richard.weiyang@xxxxxxxxx> [241107 21:55]:
>> >> On Sun, Oct 20, 2024 at 06:00:35PM -0400, Liam R. Howlett wrote:
>> >> >* Wei Yang <richard.weiyang@xxxxxxxxx> [241019 22:46]:
>> >> >> The check in mab_calc_split() is to make sure the gap between
>> >> >> [0, split] won't be too small. But we don't pass the correct min.
>> >> >> 
>> >> >> First we may encounter a pivot[split] smaller than min. For example:
>> >> >> 
>> >> >>        mt_init_flags(mt, 0);
>> >> >>        for (count = 0; count <= 240; count++) {
>> >> >>                mas_set(&mas, count);
>> >> >>                mas_store(&mas, xa_mk_value(count));
>> >> >>        }
>> >> >> 
>> >> >> On splitting root for storing 240, the pivot[split] is smaller than min.
>> >> >> This result a huge (pivot[split] - min).
>> >> >
>> >> >This is fine.
>> >> >
>> >> >There is an open work item to make it more accurate at higher levels,
>> >> >but it's not a problem as it is.
>> >> >
>> >> >Each level upwards needs a better 'minimum span', meaning that the node
>> >> >should have at least mas.min - mas.min + level * something.
>> >> 
>> >> Don't follow this. 
>> >> 
>> >> First, I guess it is mas.max - mas.min?
>> >> 
>> >
>> >No, the idea is that we could do better if we could keep the data more
>> >densely packed.
>> >
>> >If we have singletons (all range of 1), and we have 10,000 singletons,
>> >then a parent node with 0 - 128 and a sibling of 129-257 makes no sense,
>> >there is nothing that can be inserting into the parents last two slots
>> >- we can't split singletons smaller so we should have done a better job
>> >at splitting.
>> >
>> >If you think of how this grows today, we split then rebalance until the
>> >left node is full and we have to split again.  This could be better
>> >optimised to avoid multiple rebalances by doing something to push more
>> >data to the left based on the span of the children and the nodes.  But
>> >we'd want to keep enough so some modifications to each node doesn't
>> >cause a rebalance in the opposite direction.
>> >
>> >So a minimum span of parent nodes one level up from the leaves would be
>> >16 * 10 = 160, so when we split we can push more to the left to try and
>> >get to 160 while keeping the right node sufficient.  The math of the
>> >span would change based on how far up the tree we go, but would be
>> >easier to maintain.
>> >
>> >I'd probably still want to split it 50/50 the first time, but it might
>> >make sense to push more on the next rebalance.  so 10 => 6|5, 6|10 =>
>> >9|5, but maybe it's not worth the effort.  I'm starting to think so,
>> >considering it's been ineffective to date.  There is a chance that the
>> >active area gets moved to the more-full node and ends up splitting
>> >sooner anyways..
>> >
>> >All of this will probably change when dense nodes come anyways.
>> >
>> >> Then there is sth related to level. But I don't see level is used for
>> >> calculation. Would you mind sharing more your original thought?
>> >> 
>> >> >
>> >> >It works today for leaves, somewhat.
>> >> >
>> >> 
>> >> To me it works by coincidence.
>> >
>> >Seems like it.
>> >
>> >> 
>> >> The condition check is:
>> >> 
>> >>   (bn->pivot[split] - min) < (slot_count - 1) 
>> >> 
>> >> while slot_count is a constent, 16 for leaf node.
>> >> 
>> >> And we always pass 0 as min, the condition for leaf is:
>> >> 
>> >>   bn->pivot[split] < 15
>> >> 
>> >> For the mmap world, per my understanding, we won't expect an address is smaller
>> >> than 15.
>> >
>> >There are other users, but I think we should just drop this attempt at
>> >fancy stuff.
>> >
>> 
>> You mean "set mid_split = 0 and drop the min argument all together" mentioned
>> in another mail?
>
>Yes, please.  The more I think about this, the more I think we're
>wasting time trying to predict the future.  And, although fun, it
>usually doesn't work out.
>

Agree, will prepare another version with this.

Thanks

-- 
Wei Yang
Help you, Help me