Re: [PATCH v3 05/11] commit-graph: return 64-bit generation number

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Abhishek Kumar <abhishekkumar8222@xxxxxxxxx> writes:
> On Fri, Aug 21, 2020 at 03:14:34PM +0200, Jakub Narębski wrote:
>> "Abhishek Kumar via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
>>
>>> From: Abhishek Kumar <abhishekkumar8222@xxxxxxxxx>
>>>
>>> In a preparatory step, let's return timestamp_t values from
>>> commit_graph_generation(), use timestamp_t for local variables
>>
>> All right, this is all good.
>>
>>> and define GENERATION_NUMBER_INFINITY as (2 ^ 63 - 1) instead.
>>
>> This needs more detailed examination.  There are two similar constants,
>> GENERATION_NUMBER_INFINITY and GENERATION_NUMBER_MAX.  The former is
>> used for newest commits outside the commit-graph, while the latter is
>> maximum number that commits in the commit-graph can have (because of the
>> storage limitations).  We therefore need GENERATION_NUMBER_INFINITY
>> to be larger than GENERATION_NUMBER_MAX, and it is (and was).
>>
>> The GENERATION_NUMBER_INFINITY is because of the above requirement
>> traditionally taken as maximum value that can be represented in the data
>> type used to store commit's generation number _in memory_, but it can be
>> less.  For timestamp_t the maximum value that can be represented
>> is (2 ^ 63 - 1).
>>
>> All right then.
>
> Related to this, by the end of this series we are using
> GENERATION_NUMBER_MAX in just one place - compute_generation_numbers()
> to make sure the topological levels fit within 30 bits.
>
> Would it be more appropriate to rename GENERATION_NUMBER_MAX to
> GENERATION_NUMBER_V1_MAX (along the lines of
> GENERATION_NUMBER_V2_OFFSET_MAX)  to correctly describe that is a
> limit on topological levels, rather than generation number value?

Yes, I think that at the end of this patch series we should be using
GENERATION_NUMBER_V1_MAX and GENERATION_NUMBER_V2_OFFSET_MAX to describe
storage limits, and GENERATION_NUMBER_INFINITY (the latter as generation
number value for commits not in graph).

We need to ensure that both GENERATION_NUMBER_V1_MAX and
GENERATION_NUMBER_V2_OFFSET_MAX are smaller than
GENERATION_NUMBER_INFINITY.


However, as I wrote, handling GENERATION_NUMBER_V2_OFFSET_MAX is
difficult.  As far as I can see, we can choose one of the *three*
solutions (the third one is _new_):

a. store 64-bit corrected commit date in the GDAT chunk
   all possible values are able to be stored, no need for
   GENERATION_NUMBER_V2_MAX,

b. store 32-bit corrected commit date offset in the GDAT chunk,
   if its value is larger than GENERATION_NUMBER_V2_OFFSET_MAX,
   do not write GDAT chunk at all (like for backward compatibility
   with mixed-version chains of split commit-graph layers),

c. store 32-bit corrected commit date offset in the GDAT chunk,
   using some kind of overflow handling scheme; for example if
   the most significant bit of 32-bit value is 1, then the
   rest 31-bits are position in GDOV chunk, which uses 64-bit
   to store those corrected commit date offsets that do not
   fit in 32 bits.

This type of schema is used in other places in Git code, if I remember
it correctly.

>> The commit message says nothing about the new symbolic constant
>> GENERATION_NUMBER_V1_INFINITY, though.
>>
>> I'm not sure it is even needed (see comments below).
>
> Yes, you are correct. I tried it out with your suggestions and it wasn't
> really needed.
>
> Thanks for catching this!

Mistakes can happen when changig how the series is split into commits.

Best,
-- 
Jakub Narębski




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux