Re: [PATCH 0/2] pack-write,repack: prevent opening packs too early

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Aug 31, 2021 at 10:05:46PM -0400, Taylor Blau wrote:

> This pair of patches fixes a race where the .idx is moved into place
> before the .rev file, allowing the pack to be in a state where it
> appears a .rev file wasn't generated.
> 
> This can cause Git to inadvertently take the slow "generate the
> reverse index on-the-fly", which does not impact correctness, but is
> unnecessarily slow when compared to reading the .rev file.
> 
> The race is fixed by moving the .idx into place only after all other
> pack-related files have already been written. The first patch fixes
> the direct `pack-objects` case, and the second patch fixes `repack`
> (which also renames pack files around).

These both look good to me, but I think we're missing one more spot.

The first patch covers git-pack-objects directly, like:

  git pack-objects .git/objects/pack/pack

In practice, though, we never do that. On-disk repacks happen via
git-repack, which always writes to temporary files. So I thought the
case in git-pack-objects wouldn't matter, but it does look like
prepare_packed_git() will actually read .tmp-$$-pack-1234abcd files, and
so might load our temporary pack. Unexpected, but we are covered by your
first patch. And then the second patch covers us as we move those
temporary files into place.

So far so good. But the other obvious way to get a pack idx is via
index-pack (especially "--stdin").

It looks like we'd want the same re-ordering to happen in
builtin/index-pack.c:final(), which is where we rename the temporary
files into place.

We _might_ also want to re-order the calls to write_idx_file() and
write_rev_file() in its caller, given that simultaneous readers are
happy to read our tmp_pack_* files. I guess the same might apply to the
actual file write order pack-objects, too? I'm not sure if that's even
possible, though; do we rely on side effects of generating the .idx when
generating the other meta files?

I think it might be more sensible if the reading side was taught to
ignore ".tmp-*" and "tmp_*" (and possibly even ".*", though it's
possible somebody is relying on that).

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux