On 4/10/2018 11:02 PM, Junio C Hamano wrote:
Derrick Stolee <dstolee@xxxxxxxxxxxxx> writes:
@@ -800,17 +810,26 @@ static struct commit_list *paint_down_to_common(struct commit *one, int n, struc
return result;
}
prio_queue_put(&queue, one);
+ if (one->generation < min_nonstale_gen)
+ min_nonstale_gen = one->generation;
for (i = 0; i < n; i++) {
twos[i]->object.flags |= PARENT2;
prio_queue_put(&queue, twos[i]);
+ if (twos[i]->generation < min_nonstale_gen)
+ min_nonstale_gen = twos[i]->generation;
}
- while (queue_has_nonstale(&queue)) {
+ while (queue_has_nonstale(&queue, min_nonstale_gen)) {
struct commit *commit = prio_queue_get(&queue);
struct commit_list *parents;
int flags;
+ if (commit->generation > last_gen)
+ BUG("bad generation skip");
+
+ last_gen = commit->generation;
+
flags = commit->object.flags & (PARENT1 | PARENT2 | STALE);
if (flags == (PARENT1 | PARENT2)) {
if (!(commit->object.flags & RESULT)) {
@@ -830,6 +849,10 @@ static struct commit_list *paint_down_to_common(struct commit *one, int n, struc
return NULL;
p->object.flags |= flags;
Hmph. Can a commit that used to be not stale (and contributed to
the current value of min_nonstale_gen) become stale here by getting
visited twice, invalidating the value in min_nonstale_gen?
min_nonstale_gen can be "wrong" in the way you say, but fits the
definition from the commit message:
"To properly take advantage of this condition, track the minimum
generation number of a commit that **enters the queue** with nonstale
status." (Emphasis added)
You make an excellent point about how this can be problematic. I was
confused by the lack of clear performance benefits here, but I think
that whatever benefits making queue_has_nonstale() be O(1) were removed
by walking more commits than necessary.
Consider the following commit graph, where M is a parent of both A and
B, S is a parent of M and B, and there is a large set of commits
reachable from M with generation number larger than gen(S).
A B
| __/|
|/ |
M |
|\ |
. | |
. | |
. |_/
|/
S
Between A and B, the true merge base is M. Anything reachable from M is
marked as stale. When S is added to the queue, it is only reachable from
B, so it is non-stale. However, it is marked stale after M is walked.
The old code would detect this as a termination condition, but the new
code would not.
I think this data shape is actually common (not exactly, as it may be
that some ancestor of M provides a second path to S) especially in the
world of pull requests and users merging master into their topic branches.
I'll remove this commit in the next version, but use the new prototype
for queue_has_nonstale() in "commit: add short-circuit to
paint_down_to_common()" using the given 'min_generation' instead of
'min_nonstale_gen'.
Thanks,
-Stolee