On 4/18/2018 7:19 PM, Jakub Narebski wrote:
Derrick Stolee <dstolee@xxxxxxxxxxxxx> writes:
[...]
[...], and this saves time during 'git branch --contains' queries
that would otherwise walk "around" the commit we are inspecting.
If I understand the code properly, what happens is that we can now
short-circuit if all commits that are left are lower than the target
commit.
This is because max-order priority queue is used: if the commit with
maximum generation number is below generation number of target commit,
then target commit is not reachable from any commit in the priority
queue (all of which has generation number less or equal than the commit
at head of queue, i.e. all are same level or deeper); compare what I
have written in [1]
[1]: https://public-inbox.org/git/866052dkju.fsf@xxxxxxxxx/
Do I have that right? If so, it looks all right to me.
Yes, the priority queue needs to compare via generation number first or
there will be errors. This is why we could not use commit time before.
For a copy of the Linux repository, where HEAD is checked out at
v4.13~100, we get the following performance improvement for
'git branch --contains' over the previous commit:
Before: 0.21s
After: 0.13s
Rel %: -38%
[...]
flags = commit->object.flags & (PARENT1 | PARENT2 | STALE);
if (flags == (PARENT1 | PARENT2)) {
if (!(commit->object.flags & RESULT)) {
@@ -876,7 +886,7 @@ static struct commit_list *merge_bases_many(struct commit *one, int n, struct co
return NULL;
}
- list = paint_down_to_common(one, n, twos);
+ list = paint_down_to_common(one, n, twos, 0);
while (list) {
struct commit *commit = pop_commit(&list);
@@ -943,7 +953,7 @@ static int remove_redundant(struct commit **array, int cnt)
filled_index[filled] = j;
work[filled++] = array[j];
}
- common = paint_down_to_common(array[i], filled, work);
+ common = paint_down_to_common(array[i], filled, work, 0);
if (array[i]->object.flags & PARENT2)
redundant[i] = 1;
for (j = 0; j < filled; j++)
@@ -1067,7 +1077,7 @@ int in_merge_bases_many(struct commit *commit, int nr_reference, struct commit *
if (commit->generation > min_generation)
return 0;
- bases = paint_down_to_common(commit, nr_reference, reference);
+ bases = paint_down_to_common(commit, nr_reference, reference, commit->generation);
Is it the only case where we would call paint_down_to_common() with
non-zero last parameter? Would we always use commit->generation where
commit is the first parameter of paint_down_to_common()?
If both are true and will remain true, then in my humble opinion it is
not necessary to change the signature of this function.
We need to change the signature some way, but maybe the way I chose is
not the best.
To elaborate: paint_down_to_common() is used for multiple purposes. The
caller here that supplies 'commit->generation' is used only to compute
reachability (by testing if the flag PARENT2 exists on the commit, then
clears all flags). The other callers expect the full walk down to the
common commits, and keeps those PARENT1, PARENT2, and STALE flags for
future use (such as reporting merge bases). Usually the call to
paint_down_to_common() is followed by a revision walk that only halts
when reaching root commits or commits with both PARENT1 and PARENT2
flags on, so always short-circuiting on generations would break the
functionality; this is confirmed by the t5318-commit-graph.sh.
An alternative to the signature change is to add a boolean parameter
"use_cutoff" or something, that specifies "don't walk beyond the
commit". This may give a more of a clear description of what it will do
with the generation value, but since we are already performing
generation comparisons before calling paint_down_to_common() I find this
simple enough.
Thanks,
-Stolee