> However, there was a (significant) drawback: wide histories with many > refs had an explosion of memory costs to compute the commit bitmasks > during the exploration that discovers these intermediate commits. Since > these wide histories are unlikely to repeat walking objects, the benefit > of walking objects multiple times was not expensive before. But now, the > commit walk *before computing bitmaps* is incredibly expensive. Do you have numbers of how large the commit bitmasks are? > In an effort to discover a happy medium, this change reduces the walk > for intermediate commits to only the first-parent history. This focuses > the walk on how the histories converge, which still has significant > reduction in repeat object walks. It is still possible to create > quadratic behavior in this version, but it is probably less likely in > realistic data shapes. Would this work? I agree that the width of the commit bitmasks would go down (and there would also be fewer commit bitmasks generated, further increasing the memory savings). But intuitively, if there is a commit that is selected and only accessible through non-1st-parent links, then any bitmaps generated for it cannot be contributed to its descendants (since there was no descendant-to-ancestor walk that could reach it in order to form the reverse edge). > Here is some data taken on a fresh clone of the kernel: > > | runtime (sec) | peak heap (GB) | > | | | > | from | with | from | with | > | scratch | existing | scratch | existing | > -----------+---------+----------+---------+----------- > original | 64.044 | 83.241 | 2.088 | 2.194 | > last patch | 44.811 | 27.828 | 2.289 | 2.358 | > this patch | 100.641 | 35.560 | 2.152 | 2.224 | Hmm...the jump from 44 to 100 seems rather large.