Re: [PATCH v3 7/9] commit: add short-circuit to paint_down_to_common()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Derrick Stolee <stolee@xxxxxxxxx> writes:

> On 4/18/2018 7:19 PM, Jakub Narebski wrote:
>> Derrick Stolee <dstolee@xxxxxxxxxxxxx> writes:
>>
> [...]
>>> [...], and this saves time during 'git branch --contains' queries
>>> that would otherwise walk "around" the commit we are inspecting.
>>>
>> If I understand the code properly, what happens is that we can now
>> short-circuit if all commits that are left are lower than the target
>> commit.
>>
>> This is because max-order priority queue is used: if the commit with
>> maximum generation number is below generation number of target commit,
>> then target commit is not reachable from any commit in the priority
>> queue (all of which has generation number less or equal than the commit
>> at head of queue, i.e. all are same level or deeper); compare what I
>> have written in [1]
>>
>> [1]: https://public-inbox.org/git/866052dkju.fsf@xxxxxxxxx/
>>
>> Do I have that right?  If so, it looks all right to me.
>
> Yes, the priority queue needs to compare via generation number first
> or there will be errors. This is why we could not use commit time
> before.

I was more concerned about getting right the order in the priority queue
(does it return minimal or maximal generation number).

I understand that the cutoff could not be used without generation
numbers because of the possibility of clock skew - using cutoff on dates
could lead to wrong results.

>>> For a copy of the Linux repository, where HEAD is checked out at
>>> v4.13~100, we get the following performance improvement for
>>> 'git branch --contains' over the previous commit:
>>>
>>> Before: 0.21s
>>> After:  0.13s
>>> Rel %: -38%
>> [...]
>>>   		flags = commit->object.flags & (PARENT1 | PARENT2 | STALE);
>>>   		if (flags == (PARENT1 | PARENT2)) {
>>>   			if (!(commit->object.flags & RESULT)) {
>>> @@ -876,7 +886,7 @@ static struct commit_list *merge_bases_many(struct commit *one, int n, struct co
>>>   			return NULL;
>>>   	}
>>>   -	list = paint_down_to_common(one, n, twos);
>>> +	list = paint_down_to_common(one, n, twos, 0);
>>>     	while (list) {
>>>   		struct commit *commit = pop_commit(&list);
>>> @@ -943,7 +953,7 @@ static int remove_redundant(struct commit **array, int cnt)
>>>   			filled_index[filled] = j;
>>>   			work[filled++] = array[j];
>>>   		}
>>> -		common = paint_down_to_common(array[i], filled, work);
>>> +		common = paint_down_to_common(array[i], filled, work, 0);
>>>   		if (array[i]->object.flags & PARENT2)
>>>   			redundant[i] = 1;
>>>   		for (j = 0; j < filled; j++)
>>> @@ -1067,7 +1077,7 @@ int in_merge_bases_many(struct commit *commit, int nr_reference, struct commit *
>>>   	if (commit->generation > min_generation)
>>>   		return 0;
>>>   -	bases = paint_down_to_common(commit, nr_reference, reference);
>>> +	bases = paint_down_to_common(commit, nr_reference, reference, commit->generation);
>>
>> Is it the only case where we would call paint_down_to_common() with
>> non-zero last parameter?  Would we always use commit->generation where
>> commit is the first parameter of paint_down_to_common()?
>>
>> If both are true and will remain true, then in my humble opinion it is
>> not necessary to change the signature of this function.
>
> We need to change the signature some way, but maybe the way I chose is
> not the best.

No, after taking longer I think the new signature is a good choice.

> To elaborate: paint_down_to_common() is used for multiple
> purposes. The caller here that supplies 'commit->generation' is used
> only to compute reachability (by testing if the flag PARENT2 exists on
> the commit, then clears all flags). The other callers expect the full
> walk down to the common commits, and keeps those PARENT1, PARENT2, and
> STALE flags for future use (such as reporting merge bases). Usually
> the call to paint_down_to_common() is followed by a revision walk that
> only halts when reaching root commits or commits with both PARENT1 and
> PARENT2 flags on, so always short-circuiting on generations would
> break the functionality; this is confirmed by the
> t5318-commit-graph.sh.

Right.

I have realized that just after sending the email.  I'm sorry about this.

>
> An alternative to the signature change is to add a boolean parameter
> "use_cutoff" or something, that specifies "don't walk beyond the
> commit". This may give a more of a clear description of what it will
> do with the generation value, but since we are already performing
> generation comparisons before calling paint_down_to_common() I find
> this simple enough.

Two things:

1. The signature proposed in the patch is more generic.  The cutoff does
   not need to be equal to the generation number of the commit, though
   currently it always (all of one time the new mechanism is used) is.

   So now I think the new signature of paint_down_to_common() is all
   right as it is proposed here.

2. The way generation numbers are defined (with 0 being a special case,
   and generation numbers starting from 1 for parent-less commits), and
   the way they are compared (using strict comparison, to avoid having
   to special-case _ZERO, _MAX and _INFINITY generation numbers) the
   cutoff of 0 means no cutoff.

   On the other hand cutoff of 0 can be understood as meaning no cutoff
   as a special case.

   It could be made more clear to use (as I proposed elsewhere in this
   thread) symbolic name for this no-cutoff case via preprocessor
   constants or enums, e.g. GENERATION_NO_CUTOFF:

    @@ -876,7 +886,7 @@ static struct commit_list *merge_bases_many(struct commit *one, int n, struct co
      			return NULL;
      	}
      -	list = paint_down_to_common(one, n, twos);
    +	list = paint_down_to_common(one, n, twos, GENERATION_NO_CUTOFF);
        	while (list) {
      		struct commit *commit = pop_commit(&list);
    @@ -943,7 +953,7 @@ static int remove_redundant(struct commit **array, int cnt)
      			filled_index[filled] = j;
      			work[filled++] = array[j];
      		}
    -		common = paint_down_to_common(array[i], filled, work);
    +		common = paint_down_to_common(array[i], filled, work, GENERATION_NO_CUTOFF);
      		if (array[i]->object.flags & PARENT2)
      			redundant[i] = 1;
      		for (j = 0; j < filled; j++)


   But whether it makes code more readable, or less readable, is a
   matter of opinion and taste.

Best,
-- 
Jakub Narębski




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux