On Tue, May 24, 2022 at 03:03:11PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@xxxxxxxxxxxx> writes:
>
> > Calling `is_pack_valid()` early on makes it substantially less likely
> > that we will have to deal with a pack going away, since we'll have an
> > open file descriptor on its contents much earlier.
>
> Sorry for asking a stupid question (or two), but I am confused.

No such thing as a stupid question, so your apology is not necessary in
the slightest :).

> This does make sure that we can read and use the contents of the
> packfile even when somebody else removes it from the disk by
> ensuring that
>
>  (1) we have an open file descriptor to it, so that we could open
>      mmap window into it at will; or
>
>  (2) we have a mmap window that covers all of it (this should be the
>      norm on platforms with vast address space); or
>
>  (3) we are in the same state as (1) by opening the packfile to
>      validate the pack right now.
>
> and during the pack-object we are running (aka "repack"), we can
> continue to read from that pack that may have already disappeared
> from the disk.
>
> But is that sufficient? Are we writing the resulting new pack(s)
> out in such a way that the repository is healthy without the pack
> we noticed is disappearing? How do we ensure that?

It's sufficient in the sense that we're writing out all of the objects
we were asked to (from pack-objects's perspective). Of course, if the
"simultaneous writer" is just removing packs right after they are
opened, that will produce an obviously-broken state.

But assuming that repack isn't removing objects it shouldn't (which I
think is a safe assumption from pack-objects' perspective, since all it
cares about is writing packs that contain the desired set of objects),
then we are OK.

Thanks,
Taylor
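
P.S. The property that cases (1) and (3) above lean on is that, on POSIX
systems, a file that has been unlinked stays readable through any
descriptor that was opened beforehand. A minimal sketch of that behavior
outside of Git follows. It is not Git's code: `read_after_unlink` and the
temporary ".pack" file are hypothetical stand-ins, and it assumes a POSIX
filesystem (on Windows, removing a file that is still open typically
fails).

```python
import os
import tempfile


def read_after_unlink(payload: bytes) -> bytes:
    """Write payload to a temp file, unlink it, then read it back via the open fd.

    Hypothetical illustration only; assumes POSIX unlink semantics.
    """
    # mkstemp returns an already-open descriptor plus the path on disk.
    fd, path = tempfile.mkstemp(suffix=".pack")
    try:
        os.write(fd, payload)

        # Stand-in for the "simultaneous writer" removing the pack.
        os.unlink(path)
        if os.path.exists(path):
            raise RuntimeError("file should no longer be visible by name")

        # The name is gone, but the descriptor still refers to the data.
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, len(payload))
    finally:
        os.close(fd)


if __name__ == "__main__":
    data = b"PACK (fake contents)"
    print(read_after_unlink(data) == data)
```

This only shows that the reader keeps working. It says nothing about
whether the repository is consistent afterwards, which is the separate
question discussed above.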