* Wei Yang <richard.weiyang@xxxxxxxxx> [241108 20:40]: > On Thu, Nov 07, 2024 at 10:19:37PM -0500, Liam R. Howlett wrote: > >* Wei Yang <richard.weiyang@xxxxxxxxx> [241107 21:55]: > >> On Sun, Oct 20, 2024 at 06:00:35PM -0400, Liam R. Howlett wrote: > >> >* Wei Yang <richard.weiyang@xxxxxxxxx> [241019 22:46]: > >> >> The check in mab_calc_split() is to make sure the gap between > >> >> [0, split] won't be too small. But we don't pass the correct min. > >> >> > >> >> First we may encounter a pivot[split] smaller than min. For example: > >> >> > >> >> mt_init_flags(mt, 0); > >> >> for (count = 0; count <= 240; count++) { > >> >> mas_set(&mas, count); > >> >> mas_store(&mas, xa_mk_value(count)); > >> >> } > >> >> > >> >> On splitting root for storing 240, the pivot[split] is smaller than min. > >> >> This result a huge (pivot[split] - min). > >> > > >> >This is fine. > >> > > >> >There is an open work item to make it more accurate at higher levels, > >> >but it's not a problem as it is. > >> > > >> >Each level upwards needs a better 'minimum span', meaning that the node > >> >should have at least mas.min - mas.min + level * something. > >> > >> Don't follow this. > >> > >> First, I guess it is mas.max - mas.min? > >> > > > >No, the idea is that we could do better if we could keep the data more > >densely packed. > > > >If we have singletons (all range of 1), and we have 10,000 singletons, > >then a parent node with 0 - 128 and a sibling of 129-257 makes no sense, > >there is nothing that can be inserting into the parents last two slots > >- we can't split singletons smaller so we should have done a better job > >at splitting. > > > >If you think of how this grows today, we split then rebalance until the > >left node is full and we have to split again. This could be better > >optimised to avoid multiple rebalances by doing something to push more > >data to the left based on the span of the children and the nodes. But > >we'd want to keep enough so some modifications to each node doesn't > >cause a rebalance in the opposite direction. > > > >So a minimum span of parent nodes one level up from the leaves would be > >16 * 10 = 160, so when we split we can push more to the left to try and > >get to 160 while keeping the right node sufficient. The math of the > >span would change based on how far up the tree we go, but would be > >easier to maintain. > > > >I'd probably still want to split it 50/50 the first time, but it might > >make sense to push more on the next rebalance. so 10 => 6|5, 6|10 => > >9|5, but maybe it's not worth the effort. I'm starting to think so, > >considering it's been ineffective to date. There is a chance that the > >active area gets moved to the more-full node and ends up splitting > >sooner anyways.. > > > >All of this will probably change when dense nodes come anyways. > > > >> Then there is sth related to level. But I don't see level is used for > >> calculation. Would you mind sharing more your original thought? > >> > >> > > >> >It works today for leaves, somewhat. > >> > > >> > >> To me it works by coincidence. > > > >Seems like it. > > > >> > >> The condition check is: > >> > >> (bn->pivot[split] - min) < (slot_count - 1) > >> > >> while slot_count is a constent, 16 for leaf node. > >> > >> And we always pass 0 as min, the condition for leaf is: > >> > >> bn->pivot[split] < 15 > >> > >> For the mmap world, per my understanding, we won't expect an address is smaller > >> than 15. > > > >There are other users, but I think we should just drop this attempt at > >fancy stuff. > > > > You mean "set mid_split = 0 and drop the min argument all together" mentioned > in another mail? Yes, please. The more I think about this, the more I think we're wasting time trying to predict the future. And, although fun, it usually doesn't work out. > > >> > >> >> > >> >> Second prev_l_mas.min is not initialized for the first iteration. This > >> >> means on splitting leaf node, this value is mostly taking no effect. > >> > > >> >No, it is set to 0. Not initialized would cause random data loss. > >> > > >> >See MA_STATE() in the header. > >> > > >> > >> Right, I want to say not initialized to the value we want. > >> > >> Will be more careful next time. > >> > >> >> > >> >> Since we are now calculating the split of mas->node, we should use the > >> >> mas->min instead of prev_l_mas.min. > >> > > >> >This sounds reasonable, but considering what this number is used for, I > >> >don't see how it is working as well as it is today. I will need to look > >> >deeper into this. > >> > > >> >> > >> >> Signed-off-by: Wei Yang <richard.weiyang@xxxxxxxxx> > >> >> CC: Liam R. Howlett <Liam.Howlett@xxxxxxxxxx> > >> >> CC: Sidhartha Kumar <sidhartha.kumar@xxxxxxxxxx> > >> >> CC: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx> > >> >> --- > >> >> lib/maple_tree.c | 2 +- > >> >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> >> > >> >> diff --git a/lib/maple_tree.c b/lib/maple_tree.c > >> >> index 894dc5e9436e..c2d4b188646c 100644 > >> >> --- a/lib/maple_tree.c > >> >> +++ b/lib/maple_tree.c > >> >> @@ -3357,7 +3357,7 @@ static void mas_split(struct ma_state *mas, struct maple_big_node *b_node) > >> >> if (mas_push_data(mas, height, &mast, false)) > >> >> break; > >> >> > >> >> - split = mab_calc_split(mas, b_node, &mid_split, prev_l_mas.min); > >> >> + split = mab_calc_split(mas, b_node, &mid_split, mas->min); > >> >> mast_split_data(&mast, mas, split); > >> >> /* > >> >> * Usually correct, mab_mas_cp in the above call overwrites > >> >> -- > >> >> 2.34.1 > >> >> > >> > >> -- > >> Wei Yang > >> Help you, Help me > > -- > Wei Yang > Help you, Help me