Re: [PATCH v3 00/30] pack-bitmap: pseudo-merge reachability bitmaps

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, May 21, 2024 at 03:01:38PM -0400, Taylor Blau wrote:

> Here is another reroll my topic to introduce pseudo-merge bitmaps.

OK, I got through the whole thing. I left a few small comments, but
mostly just observations. Overall, the shape of it looks pretty good.
The much bigger question to me is: does it work?

The perf results you showed at the end are underwhelming, but I think
that's mostly because it's not an interesting repository. I think it
would be nice to see at least some point-in-time benchmarks on a
single repository.

But much more interesting to me is how it performs in the real world in
aggregate, over time:

  - how often / how much do pseudo-merges speed up queries in the real
    world. Clones/fetches, but also reachability queries. Could
    connectivity checks use bitmaps with this?

  - how often do pseudo-merge groups get invalidated by refs changing
    (and thus we lose the speedups from above)?

  - what's the cost like to generate them initially?

  - what's the cost like for subsequent repacks? Does the selection /
    grouping algorithm do a good job of keeping the older, larger groups
    stable (so that we can reuse them verbatim)?

I know you don't have those answers yet, and I know there's some
chicken-and-egg with getting this integrated so that you can start to
explore that. So I mostly reviewed this with an eye towards:

  - does the idea make sense (I think it does, but I'm kind of biased)

  - are the patches going to hurt anybody who isn't using the new
    feature (I think the answer is no)

  - does the on-disk representation seem right, since that is hard to
    change later (I didn't see any issues)

  - does the implementation look clean and plausibly correct (yes, but
    what I'm getting at is that I didn't pore over all of the new code
    with a microscope. Mostly I think the proof is in the pudding that
    it provides the same correct answers more quickly).

So to my eyes it looks good to move forward and let people start playing
with it. The big "experimental" warning in the config is good. Maybe
we'd want want in gitpacking(7), too?

I did wonder briefly what the backup plan is if there are problems with
the on-disk format (or worst case, if it turns out to be a dead end).
We've allocated a flag bit which I think we'd need to respect forever
(though as an optional extension, it's OK to understand and ignore it).
If we needed a "pseudo-merge bitmaps v2", how would that work? I think
we'd have to allocate another bit in the flags field.

I wonder if the start of the pseudo-merge section should have a 4-byte
version/flags field itself? I don't think that's something we've done
before, and maybe it's overkill. I dunno. It's just a lot easier to do
now than later.

-Peff




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux