Re: [PATCH v3 06/11] commit-graph: add a slab to store topological levels

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Abhishek Kumar <abhishekkumar8222@xxxxxxxxx> writes:
> On Tue, Aug 25, 2020 at 09:56:44AM +0200, Jakub Narębski wrote:
>> On Tue, 25 Aug 2020 at 09:33, Jakub Narębski <jnareb@xxxxxxxxx> wrote:
>>
>> ...
>>
>>>
>>> All right, we might want to make use of the fact that the value of 0 for
>>> topological level here always mean that its value for a commit needs to
>>> be computed, that 0 is not a valid value for topological levels.
>>> - if the value 0 came from commit-graph file, it means that it came
>>>   from Git version that used commit-graph but didn't compute generation
>>>   numbers; the value is GENERATION_NUMBER_ZERO
>>> - the value 0 might came from the fact that commit is not in graph,
>>>   and that commit-slab zero-initializes the values stored; let's
>>>   call this value GENERATION_NUMBER_UNINITIALIZED
>>>
>>> If we ensure that corrected commit date can never be zero (which is
>>> extremely unlikely, as one of root commits would have to be malformed or
>>> written on badly misconfigured computer, with value of 0 for committer
>>> timestamp), then this "happy accident" can keep working.
>>>
>>>   As a special case, commit date with timestamp of zero (01.01.1970 00:00:00Z)
>>>   has corrected commit date of one, to be able to distinguish
>>>   uninitialized values.
>>>
>>> Or something like that.
>>>
>>> Actually, it is not even necessary, as corrected commit date of 0 just
>>> means that this single value (well, for every root commit with commit
>>> date of 0) would be unnecessary recomputed in compute_generation_numbers().
>>>
>>> Anyway, we would want to document this fact in the commit message.
>> 
>> Alternatively, instead of comparing 'level' (and later in series also
>> 'corrected_commit_date') against GENERATION_NUMBER_INFINITY,
>> we could load at no extra cost `graph_pos` value and compare it
>> against COMMIT_NOT_FROM_GRAPH.
>> 
>> But with this solution we could never get rid of graph_pos, if we
>> think it is unnecessary. If we split commit_graph_data into separate
>> slabs (as it was in early versions of respective patch series), we
>> would have to pay additional cost.
>> 
>> But it is an alternative.
>
> I think updating a commit date with timestampt of zero to use corrected
> commit date of one would leave us more options down the line.
>
> Changing this is easy enough.
>
> For a root commit with timestamp zero, current->date would be zero and 
> max_corrected_commit_date would be zero as well. So we can set 
> corrected commit date as `max_corrected_commit_date + 1`, instead of the
> earlier `(current->date - 1) + 1`.
>
> ----
>
> diff --git a/commit-graph.c b/commit-graph.c
> index 7ed0a33ad6..e3c5e30405 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -1389,7 +1389,7 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx)
>  					max_level = GENERATION_NUMBER_V1_MAX - 1;
>  				*topo_level_slab_at(ctx->topo_levels, current) = max_level + 1;
>  
> -				if (current->date > max_corrected_commit_date)
> +				if (current->date && current->date > max_corrected_commit_date)
>  					max_corrected_commit_date = current->date - 1;
>  				commit_graph_data_at(current)->generation = max_corrected_commit_date + 1;
>  			}

It turned out to be much easier than I have expected: a one-line change,
adding simply a new condition.  Good work!

Perhaps it would be better to write it as current->date == GENERATION_NUMBER_UNINITIALIZED
(or *_ZERO, or *_NO_DATA,...), but current version is quite idiomatic
and easy to read.

With this change we should, of course, also change the commit-graph
format docs.


On the other hand it is a bit unnecessary.  If `generation` is zero,
using it would still work, and it would just mean that it would be
unnecessarily recomputed - but corrected commit date equal zero is
possible only for root commits.

But the above solution is more consistent, using 0 to mark not
initialized values...  it is cleaner, at the cost of one more corner
case, single line change, and possibly an insignificant amount of
performance penalty due to adding unlikely true branch to the
conditional.

Best,
-- 
Jakub Narębski




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux