On Wed, Sep 22, 2021 at 04:00:12PM -0700, Jonathan Tan wrote: > > This approach will cause a problem when multi-pack bitmaps are able to > > be generated from `git repack`, since the reference tips can change > > during the repack. Even though we ignore commits that don't exist in > > the MIDX (when doing a scan of the ref tips), it's possible that a > > commit in the MIDX reaches something that isn't. > > > > This can happen when a multi-pack index contains some pack which refers > > to loose objects (which by definition aren't included in the multi-pack > > index). > > > > By taking a snapshot of the references before we start repacking, we can > > close that race window. > > I can understand why we want the refs to remain the same both for the > MIDX generation and the MIDX bitmap generation (one reason that comes to > mind is how we select the commits for which to generate bitmaps), but I > don't understand what referring to loose objects has to do with it. I > think that using the same set of refs for MIDX generation and bitmap > generation is a good enough reason to do this, and we don't need to > mention loose objects. The point there is that a pack which contains objects that are ancestors of loose objects can show up at any time, including just before we select which packs to go into a MIDX. This is particularly common at GitHub, where all of our test-merges and objects created via the web interface are written loose. So we could compute a test merge, store the result loose, and have somebody push a pack up on top of it. If that pack shows up after the repack, but before we write a MIDX, then without the refs-snapshot code, we would include that pack, select its commits as candidates for bitmap selection, and then ultimately discover that the bitmap isn't closed, since the loose object won't be included. But that may be a little into the weeds for the patch message. Anyway, I did have to think about it for a minute, but I'm pretty sure that's what I was thinking when I wrote this. Thanks, Taylor