Re: [PATCH 6/8] commit: provide a fast multi-tip contains function

Junio C Hamano <gitster@xxxxxxxxx> · Thu, 26 Jun 2014 11:55:28 -0700

Jeff King <peff@xxxxxxxx> writes:

> I haven't quite convinced myself that the stale logic in the middle is
> right. The origin paint_down function checks "PARENT1 | PARENT2" to see
> if we found a merge base (even though PARENT2 may represent many tips).
> Here I check whether we have _any_ "left" parent flag and _any_ "right"
> parent flag. I'm not sure if I actually need to be finding the merge
> base of _all_ of the commits.

In the one-each-on-two-side case (i.e. the original algorithm), it
is "any commit we will encounter by digging further down from a
STALE one will all be reachable from a newer merge base (e.g. the
commit that caused it to be marked as STALE originally).  It will
never be a useful merge base, so let's mark it as STALE.  Even if a
future traversal that comes from sideways (i.e. not passing the
commit that caused it to be marked as STALE) reach this STALE one,
digging further from there won't give us anything new."

If you see a commit can be reached from L1 and R2, the only thing
you know is that its parents can also be reached from L1 and R2, but
it does not tell you if it is reachable from other tips, e.g. L2 or
R1.  When a traversal reaches such a node from sideways, trying to
paint it with L2 for example, you do need to continue digging.

I think the traversal optimization based on the STALE bit is merely
a halfway optimization to cull obvious duplicates.  See how
get_merge_bases_many() needs to clean up redundant ones in the
result of merge_bases_many(), the latter of which does use the STALE
bit to remove obvious duplicates, in order to make sure it won't
include a commit in the result that can be reached from another
commit in the result, without having to resort to the SLOP trick.

Because your end-goal is to tell if Ln is reachable from Rm (for
number of n's and m's), coming up with the independent/non-redundant
set of merge-bases is not necessary, I think.  I suspect that you do
not need the STALE half-optimization in the first place.

The only time you can say "Ah, we've seen this one and no need to
dig further" is when you are propagating a colour C and the parent
in question is already painted in C (it is OK to be painted as
reachable from more tips), I would think, so shouldn't the loop be
more like, after painting each tip and placing it in the queue:

 * Dequeue one, check the L/R bits in it and call that C

 * Iterate over its parents P:

   * check the L/R bits in P and call that Cp.

   * If Cp is already a superset of C, there is no point putting P
     to the queue to be processed.

   * If Cp is not a superset of C, then update L/R bits in P to mark
     it reachable from tips represented by C and put P to the queue.

until you ran out of queue?

> +void commit_contains(const struct commit_list *left,
> +		     const struct commit_list *right,
> +		     unsigned char *left_contains,
> +		     unsigned char *right_contains)
> +{
> +	struct prio_queue queue = { compare_commits_by_commit_date };
> +	struct bit_slab left_bits, right_bits, stale_bits;
> +	int left_nr, right_nr;
> +
> +	left_nr = init_contains_bits(left, &left_bits, &queue);
> +	right_nr = init_contains_bits(right, &right_bits, &queue);
> +	init_bit_slab(&stale_bits);
> +
> +	while (queue_has_nonstale_bits(&queue, &stale_bits)) {
> +		struct commit *commit = prio_queue_get(&queue);
> +		struct commit_list *parents;
> +		unsigned char *c_left, *c_right, *c_stale;
> +
> +		c_left = bit_slab_at(&left_bits, commit);
> +		c_right = bit_slab_at(&right_bits, commit);
> +		c_stale = bit_slab_at(&stale_bits, commit);
> +
> +		if (!bitset_empty(c_left, left_nr) &&
> +		    !bitset_empty(c_right, right_nr))
> +			*c_stale = 1;

Hmph, is this one-char-a bit?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html