Derrick Stolee <stolee@xxxxxxxxx> writes: > On 4/18/2018 7:19 PM, Jakub Narebski wrote: >> Derrick Stolee <dstolee@xxxxxxxxxxxxx> writes: >> > [...] >>> [...], and this saves time during 'git branch --contains' queries >>> that would otherwise walk "around" the commit we are inspecting. >>> >> If I understand the code properly, what happens is that we can now >> short-circuit if all commits that are left are lower than the target >> commit. >> >> This is because max-order priority queue is used: if the commit with >> maximum generation number is below generation number of target commit, >> then target commit is not reachable from any commit in the priority >> queue (all of which has generation number less or equal than the commit >> at head of queue, i.e. all are same level or deeper); compare what I >> have written in [1] >> >> [1]: https://public-inbox.org/git/866052dkju.fsf@xxxxxxxxx/ >> >> Do I have that right? If so, it looks all right to me. > > Yes, the priority queue needs to compare via generation number first > or there will be errors. This is why we could not use commit time > before. I was more concerned about getting right the order in the priority queue (does it return minimal or maximal generation number). I understand that the cutoff could not be used without generation numbers because of the possibility of clock skew - using cutoff on dates could lead to wrong results. >>> For a copy of the Linux repository, where HEAD is checked out at >>> v4.13~100, we get the following performance improvement for >>> 'git branch --contains' over the previous commit: >>> >>> Before: 0.21s >>> After: 0.13s >>> Rel %: -38% >> [...] >>> flags = commit->object.flags & (PARENT1 | PARENT2 | STALE); >>> if (flags == (PARENT1 | PARENT2)) { >>> if (!(commit->object.flags & RESULT)) { >>> @@ -876,7 +886,7 @@ static struct commit_list *merge_bases_many(struct commit *one, int n, struct co >>> return NULL; >>> } >>> - list = paint_down_to_common(one, n, twos); >>> + list = paint_down_to_common(one, n, twos, 0); >>> while (list) { >>> struct commit *commit = pop_commit(&list); >>> @@ -943,7 +953,7 @@ static int remove_redundant(struct commit **array, int cnt) >>> filled_index[filled] = j; >>> work[filled++] = array[j]; >>> } >>> - common = paint_down_to_common(array[i], filled, work); >>> + common = paint_down_to_common(array[i], filled, work, 0); >>> if (array[i]->object.flags & PARENT2) >>> redundant[i] = 1; >>> for (j = 0; j < filled; j++) >>> @@ -1067,7 +1077,7 @@ int in_merge_bases_many(struct commit *commit, int nr_reference, struct commit * >>> if (commit->generation > min_generation) >>> return 0; >>> - bases = paint_down_to_common(commit, nr_reference, reference); >>> + bases = paint_down_to_common(commit, nr_reference, reference, commit->generation); >> >> Is it the only case where we would call paint_down_to_common() with >> non-zero last parameter? Would we always use commit->generation where >> commit is the first parameter of paint_down_to_common()? >> >> If both are true and will remain true, then in my humble opinion it is >> not necessary to change the signature of this function. > > We need to change the signature some way, but maybe the way I chose is > not the best. No, after taking longer I think the new signature is a good choice. > To elaborate: paint_down_to_common() is used for multiple > purposes. The caller here that supplies 'commit->generation' is used > only to compute reachability (by testing if the flag PARENT2 exists on > the commit, then clears all flags). The other callers expect the full > walk down to the common commits, and keeps those PARENT1, PARENT2, and > STALE flags for future use (such as reporting merge bases). Usually > the call to paint_down_to_common() is followed by a revision walk that > only halts when reaching root commits or commits with both PARENT1 and > PARENT2 flags on, so always short-circuiting on generations would > break the functionality; this is confirmed by the > t5318-commit-graph.sh. Right. I have realized that just after sending the email. I'm sorry about this. > > An alternative to the signature change is to add a boolean parameter > "use_cutoff" or something, that specifies "don't walk beyond the > commit". This may give a more of a clear description of what it will > do with the generation value, but since we are already performing > generation comparisons before calling paint_down_to_common() I find > this simple enough. Two things: 1. The signature proposed in the patch is more generic. The cutoff does not need to be equal to the generation number of the commit, though currently it always (all of one time the new mechanism is used) is. So now I think the new signature of paint_down_to_common() is all right as it is proposed here. 2. The way generation numbers are defined (with 0 being a special case, and generation numbers starting from 1 for parent-less commits), and the way they are compared (using strict comparison, to avoid having to special-case _ZERO, _MAX and _INFINITY generation numbers) the cutoff of 0 means no cutoff. On the other hand cutoff of 0 can be understood as meaning no cutoff as a special case. It could be made more clear to use (as I proposed elsewhere in this thread) symbolic name for this no-cutoff case via preprocessor constants or enums, e.g. GENERATION_NO_CUTOFF: @@ -876,7 +886,7 @@ static struct commit_list *merge_bases_many(struct commit *one, int n, struct co return NULL; } - list = paint_down_to_common(one, n, twos); + list = paint_down_to_common(one, n, twos, GENERATION_NO_CUTOFF); while (list) { struct commit *commit = pop_commit(&list); @@ -943,7 +953,7 @@ static int remove_redundant(struct commit **array, int cnt) filled_index[filled] = j; work[filled++] = array[j]; } - common = paint_down_to_common(array[i], filled, work); + common = paint_down_to_common(array[i], filled, work, GENERATION_NO_CUTOFF); if (array[i]->object.flags & PARENT2) redundant[i] = 1; for (j = 0; j < filled; j++) But whether it makes code more readable, or less readable, is a matter of opinion and taste. Best, -- Jakub Narębski