On 2/24/2022 6:01 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
>
>> From: Derrick Stolee <derrickstolee@xxxxxxxxxx>
>>
>> The commit-graph file format v2 alters how it stores the corrected
>> commit date offsets within the Commit Data chunk instead of a separate
>> chunk. The idea is to significantly reduce the amount of data loaded
>> from disk while parsing the commit-graph.
>>
>> We need to alter the error message when we see a file format version
>> outside of our range now that multiple are possible. This has a
>> non-functional side-effect of altering a use of GRAPH_VERSION within
>> write_commit_graph().
>>
>> By storing the file format version in 'struct commit_graph', we can
>> alter the parsing code to depend on that version value. This involves
>> changing where we look for the corrected commit date offset, but also
>> which constants we use for jumping into the Generation Data Overflow
>> chunk. The Commit Data chunk only has 30 bits available for the offset
>> while the Generation Data chunk has 32 bits. This only makes a
>> meaningful difference in very malformed repositories.
>>
>> Also, we need to be careful about how we enable using corrected commit
>> dates and generation numbers to rely upon the read_generation_data value
>> instead of a non-zero value in the Commit Date chunk. In
>> generation_numbers_enabled(), the first_generation variable is
>> attempting to look for the first topological level stored to see that it
>> is nonzero. However, for a v2 commit-graph, this value is actually
>> likely to be zero because the corrected commit date offset is probably
>> zero.
>
> I see references to OVERFLOW_V3 that comes after OVERFLOW, but there
> is no OVERFLOW_V2. Intended, or should it be V2 to match the "file
> format v2" of "generation number v2"? It is getting awkward to have
> two different version schemes ("gen v2" means corrected committer
> timestamp, whose on-disk representation is different between "file
> v1" and "file v2", and this OVERFLOW vs OVERFLOW_V3 is about the
> difference between "file v1" and "file v2" if I am following the
> series correctly).

You're right that it would be clearer to rename OVERFLOW to OVERFLOW_V2.
I'll add that to my next version when these patches appear on their own.

Thanks,
-Stolee