Re: [PATCH v3 06/11] commit-graph: add a slab to store topological levels

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello,

Abhishek Kumar <abhishekkumar8222@xxxxxxxxx> writes:
> On Fri, Aug 21, 2020 at 08:43:38PM +0200, Jakub Narębski wrote:
>> "Abhishek Kumar via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
>>
>>> From: Abhishek Kumar <abhishekkumar8222@xxxxxxxxx>
[...]
>>> @@ -1335,11 +1341,11 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx)
>>>  					_("Computing commit graph generation numbers"),
>>>  					ctx->commits.nr);
>>>  	for (i = 0; i < ctx->commits.nr; i++) {
>>> -		uint32_t generation = commit_graph_data_at(ctx->commits.list[i])->generation;
>>> +		uint32_t level = *topo_level_slab_at(ctx->topo_levels, ctx->commits.list[i]);
>>
>> All right, so that is why this 'generation' variable was not converted
>> to timestamp_t type.
>>
>>>
>>>  		display_progress(ctx->progress, i + 1);
>>> -		if (generation != GENERATION_NUMBER_V1_INFINITY &&
>>> -		    generation != GENERATION_NUMBER_ZERO)
>>> +		if (level != GENERATION_NUMBER_V1_INFINITY &&
>>> +		    level != GENERATION_NUMBER_ZERO)
>>>  			continue;
>>
>> Here we use GENERATION_NUMBER*_INFINITY to check if the commit is
>> outside commit-graph files, and therefore we would need its topological
>> level computed.
>>
>> However, I don't understand how it works.  We have had created the
>> commit_graph_data_at() and use it instead of commit_graph_data_slab_at()
>> to provide default values for `struct commit_graph`... but only for
>> `graph_pos` member.  It is commit_graph_generation() that returns
>> GENERATION_NUMBER_INFINITY for commits not in graph.
>>
>> But neither commit_graph_data_at()->generation nor topo_level_slab_at()
>> handles this special case, so I don't see how 'generation' variable can
>> *ever* be GENERATION_NUMBER_INFINITY, and 'level' variable can ever be
>> GENERATION_NUMBER_V1_INFINITY for commits not in graph.
>>
>> Does it work *accidentally*, because the default value for uninitialized
>> data on commit-slab is 0, which matches GENERATION_NUMBER_ZERO?  It
>> certainly looks like it does.  And GENERATION_NUMBER_ZERO is an artifact
>> of commit-graph feature development history, namely the short time where
>> Git didn't use any generation numbers and stored 0 in the place set for
>> it in the commit-graph format...  On the other hand this is not the case
>> for corrected commit date (generation number v2), as it could
>> "legitimately" be 0 if some root commit (without any parents) had
>> committerdate of epoch 0, i.e. 1 January 1970 00:00:00 UTC, perhaps
>> caused by malformed but valid commit object.
>>
>> Ugh...
>
> It works accidentally.
>
> Our decision to avoid the cost of initializing both
> commit_graph_data->generation and commit_graph_data->graph_pos has
> led to some unwieldy choices - the complexity of helper functions,
> bypassing helper functions when writing a commit-graph file [1].
>
> I want to re-visit how commit_graph_data slab is defined in a future series.
>
> [1]: https://lore.kernel.org/git/be28ab7b-0ae4-2cc5-7f2b-92075de3723a@xxxxxxxxx/

All right, we might want to make use of the fact that the value of 0 for
topological level here always mean that its value for a commit needs to
be computed, that 0 is not a valid value for topological levels.
- if the value 0 came from commit-graph file, it means that it came
  from Git version that used commit-graph but didn't compute generation
  numbers; the value is GENERATION_NUMBER_ZERO
- the value 0 might came from the fact that commit is not in graph,
  and that commit-slab zero-initializes the values stored; let's
  call this value GENERATION_NUMBER_UNINITIALIZED

If we ensure that corrected commit date can never be zero (which is
extremely unlikely, as one of root commits would have to be malformed or
written on badly misconfigured computer, with value of 0 for committer
timestamp), then this "happy accident" can keep working.

  As a special case, commit date with timestamp of zero (01.01.1970 00:00:00Z)
  has corrected commit date of one, to be able to distinguish
  uninitialized values.

Or something like that.

Actually, it is not even necessary, as corrected commit date of 0 just
means that this single value (well, for every root commit with commit
date of 0) would be unnecessary recomputed in compute_generation_numbers().

Anyway, we would want to document this fact in the commit message.

Best,
-- 
Jakub Narębski




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux