Re: [PATCH 6/7] commit-graph: parse file format v2

On 2/24/2022 6:01 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
> 
>> From: Derrick Stolee <derrickstolee@xxxxxxxxxx>
>>
>> The commit-graph file format v2 stores the corrected commit date
>> offsets within the Commit Data chunk instead of in a separate chunk.
>> The idea is to significantly reduce the amount of data loaded from
>> disk while parsing the commit-graph.
>>
>> We need to alter the error message when we see a file format version
>> outside of our range now that multiple are possible. This has a
>> non-functional side-effect of altering a use of GRAPH_VERSION within
>> write_commit_graph().
>>
>> By storing the file format version in 'struct commit_graph', we can
>> alter the parsing code to depend on that version value. This involves
>> changing where we look for the corrected commit date offset, but also
>> which constants we use for jumping into the Generation Data Overflow
>> chunk. The Commit Data chunk only has 30 bits available for the offset
>> while the Generation Data chunk has 32 bits. This only makes a
>> meaningful difference in very malformed repositories.
>>
>> Also, we need to be careful about how we enable using corrected commit
>> dates and generation numbers to rely upon the read_generation_data value
>> instead of a non-zero value in the Commit Data chunk. In
>> generation_numbers_enabled(), the first_generation variable is
>> attempting to look for the first topological level stored to see that it
>> is nonzero. However, for a v2 commit-graph, this value is actually
>> likely to be zero because the corrected commit date offset is probably
>> zero.
> 
> I see references to OVERFLOW_V3 that come after OVERFLOW, but there
> is no OVERFLOW_V2.  Intended, or should it be V2 to match the "file
> format v2" or "generation number v2"?  It is getting awkward to have
> two different version schemes ("gen v2" means corrected committer
> timestamp, whose on-disk representation is different between "file
> v1" and "file v2", and this OVERFLOW vs OVERFLOW_V3 is about the
> difference between "file v1" and "file v2" if I am following the
> series correctly).

You're right that it would be clearer to rename OVERFLOW to
OVERFLOW_V2. I'll add that to my next version when these patches
appear on their own.
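
To make the 30-bit versus 32-bit distinction from the commit message
concrete, here is a rough sketch of the idea, not the code from the
patch. It assumes the top bit of the offset field flags that the
remaining bits index into the Generation Data Overflow chunk; the
helper name decode_offset(), its parameters, and the UINT64_MAX error
sentinel are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): decode a corrected commit
 * date offset stored in a field that is `bits` wide. The top bit of the
 * field is assumed to be an overflow flag; when it is set, the remaining
 * bits are an index into a table of 64-bit overflow values.
 */
static uint64_t decode_offset(uint32_t raw, unsigned bits,
			      const uint64_t *overflow, size_t nr_overflow)
{
	uint32_t flag, mask;

	assert(bits >= 2 && bits <= 32);
	flag = UINT32_C(1) << (bits - 1);	/* overflow marker bit */
	mask = flag - 1;			/* bits below the marker */

	if (!(raw & flag))
		return raw & mask;		/* offset stored inline */

	if ((raw & mask) >= nr_overflow)
		return UINT64_MAX;		/* index out of range */

	return overflow[raw & mask];		/* offset stored in table */
}
```

With a 30-bit field the marker is bit 29, while with a 32-bit field it
is bit 31, so the same raw value can decode differently depending on
the file format version. That is why the parser needs a separate
constant for each version.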

Thanks,
-Stolee


