> > > In an effort to discover a happy medium, this change reduces the walk
> > > for intermediate commits to only the first-parent history. This focuses
> > > the walk on how the histories converge, which still has significant
> > > reduction in repeat object walks. It is still possible to create
> > > quadratic behavior in this version, but it is probably less likely in
> > > realistic data shapes.
> >
> > Would this work? I agree that the width of the commit bitmasks would go
> > down (and there would also be fewer commit bitmasks generated, further
> > increasing the memory savings). But intuitively, if there is a commit
> > that is selected and only accessible through non-1st-parent links, then
> > any bitmaps generated for it cannot be contributed to its descendants
> > (since there was no descendant-to-ancestor walk that could reach it in
> > order to form the reverse edge).
>
> s/bitmaps/bitmasks.

I do mean bitmaps there - bitmasks are contributed to parents, but
bitmaps are contributed to descendants, if I remember correctly.

> We'll select commits independent of their first
> parent histories, and so in the situation that you're describing, if C
> reaches A only through non-1st-parent history, then A's bitmask will not
> contain the bits from C.

C is the descendant and A is the ancestor. Yes, A's bitmask will not
contain the bits from C.

> But when generating the reachability bitmap for C, we'll still find that
> we've generated a bitmap for A, and we can copy its bits directly.

Here is my contention - this can happen only if there is a reverse edge
from A to C, as far as I can tell, but such a reverse edge has not been
formed.

> If
> this differs from an ancestor P that _is_ in the first-parent history,
> then P pushed its bits to C before calling fill_bitmap_commit() through
> the reverse edges.
>
> > > Here is some data taken on a fresh clone of the kernel:
> > >
> > >             |   runtime (sec)    |   peak heap (GB)   |
> > >             |                    |                    |
> > >             |   from  |   with   |   from  |   with   |
> > >             | scratch | existing | scratch | existing |
> > > ------------+---------+----------+---------+----------
> > >   original  |  64.044 |   83.241 |   2.088 |    2.194 |
> > >  last patch |  44.811 |   27.828 |   2.289 |    2.358 |
> > >  this patch | 100.641 |   35.560 |   2.152 |    2.224 |
> >
> > Hmm...the jump from 44 to 100 seems rather large.
>
> Indeed. It's ameliorated a little bit in the later patches. We are
> over-walking some objects (as in we are walking them multiple times),
> but the return we get is reducing the peak heap usage from what it was
> in the last patch.
>
> In the "unfathomably large" category, this makes things tractable.
> Quoting from the next patch [1]:
>
>             |   runtime (sec)    |   peak heap (GB)   |
>             |                    |                    |
>             |   from  |   with   |   from  |   with   |
>             | scratch | existing | scratch | existing |
> ------------+---------+----------+---------+----------
>  last patch | 100.641 |   35.560 |   2.152 |    2.224 |
>  this patch |  99.720 |   11.696 |   2.152 |    2.217 |

That is true, but it is not ameliorated much :-(

If you have steps to generate these timings, I would like to try
comparing the performance between all patches and all-except-23.

[1] https://lore.kernel.org/git/42399a1c2e52e1d055a2d0ad96af2ca4dce6b1a0.1605649533.git.me@xxxxxxxxxxxx/
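
To make the reverse-edge point concrete, here is a toy C sketch of the
shape I have in mind. It is purely illustrative: the struct and helper
names are made up and are not the real pack-bitmap-write.c data
structures (which, among other things, keep a list of reverse edges
rather than a single pointer).

/* reverse-edge-sketch.c: a toy model, NOT the real pack-bitmap-write.c. */
#include <stdio.h>

struct toy_commit {
	const char *name;
	struct toy_commit *parent1;      /* first parent, or NULL */
	struct toy_commit *parent2;      /* second parent (merge), or NULL */
	struct toy_commit *reverse_edge; /* descendant that may reuse our bitmap */
};

/*
 * The intermediate walk in this patch follows only first parents, so
 * only the first-parent ancestors of `tip` get a reverse edge pointing
 * back at it.
 */
static void add_reverse_edges_first_parent(struct toy_commit *tip)
{
	struct toy_commit *c;
	for (c = tip->parent1; c; c = c->parent1)
		c->reverse_edge = tip;
}

int main(void)
{
	/*
	 * History:
	 *   C --1st--> P --1st--> root
	 *    \--2nd--> A --1st--> root
	 * A is a selected commit with its own bitmap, but it is reachable
	 * from C only through a second-parent link.
	 */
	struct toy_commit root = { "root", NULL, NULL, NULL };
	struct toy_commit A = { "A", &root, NULL, NULL };
	struct toy_commit P = { "P", &root, NULL, NULL };
	struct toy_commit C = { "C", &P, &A, NULL };
	struct toy_commit *all[] = { &root, &A, &P, &C };
	int i;

	add_reverse_edges_first_parent(&C);

	for (i = 0; i < 4; i++)
		printf("%-4s reverse edge -> %s\n", all[i]->name,
		       all[i]->reverse_edge ? all[i]->reverse_edge->name
					    : "(none)");
	/*
	 * A has no reverse edge to C, so when C's bitmap is filled there
	 * is nothing to copy A's bits from; those objects get walked again.
	 */
	return 0;
}

Running this prints that P and root each end up with a reverse edge back
to C while A has none, which is why I think C's bitmap cannot simply
reuse A's bits in this shape.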