On 12/10/2021 8:39 PM, Taylor Blau wrote:
> On Fri, Dec 10, 2021 at 05:31:27PM -0500, Taylor Blau wrote:
>> I had originally imagined that storing the preferred pack's identity
>> alone would be enough to solve this bug. But that isn't quite so,
>> because we break ties among duplicate objects first by preferred-ness,
>> then by their pack's mtime. So that could change too, and it would cause
>> us to break in the same way.
>>
>> At the bare minimum you need an ordering of all of the packs in the
>> MIDX (like I had originally imagined here). At most, we could do
>> something like what is unintentionally written here, which would allow
>> us to get rid of MIDX .rev files entirely. I think doing the former is
>> simpler, and I am not sure if there are practical advantages to the
>> latter.
>
> Thinking on it more, I don't think this "at minimum you would need..."
> is quite right either. It would suffice to know the identity of the
> preferred pack, and the mtimes of all of the other packs, since that
> alone is enough to reconstruct the object order.
>
> That is pretty appealing, too, because knowing the order of packs would
> require some major surgery (the order of packs isn't really something
> the MIDX code thinks about, it's inferred from the way it sorts
> objects).

I think the root cause is that the object order can change when the
preferred pack changes, even with the same set of pack-files.

Suppose we later add more complicated ways of deduplicating objects
across the packs. Then whatever we include here based on preferred
packs and mtimes would need to be updated to match. However, if we
store the contents of the .rev file in the MIDX itself, then we don't
need that extra layer of indirection.

I'm leaning towards keeping the contents of the PORD chunk as-is, but
renaming it to something like OORD (for object order). Then, we can
carefully transition from using the .rev file to reading this chunk.
We will want to continue looking for the .rev file when this chunk
does not exist.

Thanks,
-Stolee
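
P.S. To make the root cause concrete, here is a small Python sketch
(not Git's actual C code) of the tie-breaking described in the quoted
text: among duplicate copies of an object, prefer the copy in the
preferred pack, then fall back to pack mtime. The names Copy and
pick_copies are made up for illustration, and the "newer mtime wins"
direction is an assumption that the thread does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of an object inside one pack (hypothetical model)."""
    oid: str    # object ID
    pack: str   # name of the pack holding this copy
    mtime: int  # mtime of that pack


def pick_copies(copies, preferred_pack):
    """Return {oid: pack} naming the copy selected for each object.

    Ties among duplicates are broken first by preferred-ness, then by
    pack mtime (ASSUMPTION: the newer pack wins).
    """
    chosen = {}
    for c in copies:
        # Smaller key wins: preferred pack sorts first, then newer mtime.
        key = (c.pack != preferred_pack, -c.mtime)
        best = chosen.get(c.oid)
        if best is None or key < best[0]:
            chosen[c.oid] = (key, c)
    return {oid: c.pack for oid, (_, c) in chosen.items()}


# The same set of packs, with only the preferred pack changed.
copies = [
    Copy("aaaa", "pack-1", 100),
    Copy("aaaa", "pack-2", 200),
    Copy("bbbb", "pack-1", 100),
]
with_p1 = pick_copies(copies, "pack-1")
with_p2 = pick_copies(copies, "pack-2")
```

With the same two packs, the selected copy of "aaaa" moves from pack-1
to pack-2 when the preferred pack changes, so any object order derived
from those selections changes as well. Storing the order itself avoids
having to re-derive it from these inputs.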