Re: [PATCH] fixup! midx: implement writing incremental MIDX bitmaps

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Aug 29, 2024 at 02:57:08PM -0400, Taylor Blau wrote:

> On Wed, Aug 28, 2024 at 02:33:56PM -0400, Jeff King wrote:
> > Is that right, though? It looks like the caller might pass in a
> > tempfile name like .../pack/multi-pack-index.d/tmp_midx_XXXXXX,
> > if we're in incremental mode. But we'll write directly to
> > "multi-pack-index-$hash.bitmap" in the same directory. I'm not sure to
> > what degree it matters, since that's the name we want in the long run.
> > But would we possibly overwrite an active-in-use file rather than doing
> > the atomic rename-into-place if we happened to generate the same midx?
> >
> > It feels like we should still respect the name the caller is using for
> > tempfiles, and then rename it into the correct spot at the end.
> 
> In either case, we're going to write to a temporary file initialized by
> the pack-bitmap machinery and then rename() it into place at the end of
> bitmap_writer_finish().

OK, that addresses my worry, if we're always writing to a tempfile (and
I verified with some recent stracing that this is the case). So renaming
that into tmp_midx_XXXXXX.bitmap would just be a pointless extra layer
of renames.

I do wonder if it's possible for us to generate a new different revindex
and bitmap pair for the same midx hash, and for a reader to see a
mismatched set for a moment. But that's an atomicity problem, and an
extra layer of renames is not going to solve that.

> On the caller side, in the non-incremental mode, we'll pass
> $GIT_DIR/objects/pack/multi-pack-index-$hash.bitmap as the location,
> write its contents into a temporary file, and then rename() it there.
> 
> But in the incremental mode this series introduces, I think it would be
> a bug to pass a tmp_midx_XXXXXX file path there, since nobody would move
> it from tmp_midx_XXXXX-$HASH.bitmap into its final location.
> 
> So I think what's written here with the fixup! patch is right (and
> should be squashed into 13/13 in the next round), but let me know if I'm
> missing something.

What confused me is that write_midx_reverse_index() _does_ still take
midx_name, and respects it. But I think that is a bug!

We do not usually even call that function, since modern midx's have a
RIDX chunk inside them instead of a separate file. But if you do this:

  # generate an extra pack
  git commit --allow-empty -m foo
  git repack -d

  # make an incremental midx with a .rev file; usually this ends up
  # as a RIDX chunk, so we have to force it.
  GIT_TEST_MIDX_WRITE_REV=1 git multi-pack-index write --incremental --bitmap

then you'll end up with a tmp_midx_XXXXXX-*.rev file leftover in
multi-pack-index.d (since, as you note, nobody is moving those into
place).

So probably write_midx_reverse_index() needs the same treatment to
derive its own filenames for the incremental case, and to drop the
midx_name parameter.

Or I wonder if we could simply drop the code to write a separate .rev
file entirely? I don't think there's a reason anybody would want it.

-Peff




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux