On Wed, Sep 01, 2021 at 12:29:54AM -0400, Taylor Blau wrote: > > We _might_ also want to re-order the calls to write_idx_file() and > > write_rev_file() in its caller, given that simultaneous readers are > > happy to read our tmp_pack_* files. I guess the same might apply to the > > actual file write order pack-objects, too? I'm not sure if that's even > > possible, though; do we rely on side effects of generating the .idx when > > generating the other meta files? > > These are a little trickier. write_idx_file() is also responsible for > rearranging the objects array into index (name) order on the way out, > which write_rev_file() depends on in order to build up its mapping. > > So you could sort the array at the call-site before calling either > function, but it gets awkward since there are a handful of other callers > of write_idx_file() besides the two we're discussing. Yeah, I had in the back of my mind that there was some dependency there. I definitely prefer the "readers should not pick up tmp-packs" approach if that is workable. > > I think it might be more sensible if the reading side was taught to > > ignore ".tmp-*" and "tmp_*" (and possibly even ".*", though it's > > possible somebody is relying on that). > > ...this seems like the much-better way to go. Git shouldn't have to > worry about what order it writes the temporary files in, only what order > those temporary files are made non-temporary. > > But I need to do some more investigation to make sure there aren't any > other implications. So I'm happy to wait on that, or send a new version > of this series with a patch to fix the race in > builtin/index-pack.c:final(), too. I think if we kept it restricted to ".tmp-*" and "tmp_*", it should be pretty safe. The absolute worst case is that somebody trying to recover a corrupted repository might have to rename the files manually, I would think. Blocking ".*" is a harder sell. If we were starting from scratch, I'd probably do that, but now we don't know what weird things people might be doing. So unless there's a huge gain, it's hard to justify. (If we were starting from scratch, I'd actually probably insist they be named pack-$checksum.pack, etc, but it's much too late for that now). So anyway. I think we definitely want the index-pack.c change. We _could_ stop there and change the read side as a separate series, but I think that until we do, the ordering changes on the write side aren't really doing that much. They're shrinking the race a little, I guess, but it's still very much there. > (Unrelated to your suggestions above) another consideration for "stuff > to do later" would be to flip the default of pack.writeReverseIndex. I > had intentions to do that in the 2.32 cycle, but I forgot about it. Oh, yeah. We should definitely do that (in its own series). The .rev files have been a huge performance win, and I don't think there's any reason we wouldn't want to always use them. -Peff