Re: [RFC/PATCH 1/3] revision.c: tighten up TREESAME handling of merges

On 28/04/2013 01:36, Junio C Hamano wrote:
> Kevin Bracey <kevin@xxxxxxxxx> writes:
>
>> Historically TREESAME was set on a commit if it was TREESAME to _any_ of
>> its parents. This is not optimal, as such a merge could still be worth
>> showing, particularly if it is an odd "-s ours" merge that (possibly
>> accidentally) dropped a change.
>
> "... and with options like --full-history or --simplify-merges are
> used to get more complete history", I think.  "git log path" without
> these options is a tool to get one version of simplified history
> that explains the end result, and by definition, the side branch
> merged by "-s ours" did _not_ contribute anything to the end result.

Yeah, I'm not happy with this commit message - I knocked it up separately from my first pass, which I didn't have to hand. Next version will combine it with the original, which better distinguished the default mode, and specifically addressed the "--full-history -S" search problem.

That's key - that I really want such searches to be able to track the entire life of a change on a side branch, not potentially showing just its birth as now, but also always including any ultimate merge death. (I think that we may be able to refine --ancestry-path to give an even tighter pinpoint, but --full-history should definitely include the information, as per its name).


> Do we want to discard the decoration data when the commit becomes a
> non-merge?

Would seem reasonable, and would also help make concrete why we update TREESAME immediately, and not in update_treesame(), but I didn't spot a mechanism to discard decoration. I'll recheck.


>> +		commit->object.flags |= TREESAME;
>> +		for (n = 0; n < st->nparents; n++) {
>> +			if (!st->treesame[n]) {
>> +				commit->object.flags &= ~TREESAME;
>> +				break;
>> +			}
>> +		}
>
> Can a commit that earlier was marked as TREESAME become not TREESAME?
> Wouldn't simplification only increase sameness, never decrease?

That's true - I paid attention to that earlier when it really mattered due to the cost of recalculating it with try_to_simplify_commit(). Not sure that it matters so much any more, and I don't see how we can use that information to change this "scan for !treesame" loop.

I could insert an "if (!(commit->object.flags & TREESAME))" test to skip the entire update. I'd be inclined to do that in the caller of update_treesame(). I think update_treesame() itself should be general-purpose, without assumptions about what changes have been made, so it's a pure treesame[]->TREESAME calculation, without TREESAME as an input.
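
A minimal sketch of that split, assuming a cut-down treesame_state with a fixed-size array and an arbitrary TREESAME bit value (both are hypothetical stand-ins, not the real revision.c definitions). update_treesame_flags() derives the bit purely from treesame[]; maybe_update() is a hypothetical caller showing the skip, with the parentheses the test needs because "!" binds tighter than "&":

```c
#include <stddef.h>

/* Hypothetical stand-ins: bit value and struct layout are assumptions. */
#define TREESAME (1u << 2)

struct treesame_state {
	unsigned int nparents;
	unsigned char treesame[8];	/* fixed size only to keep the sketch short */
};

/*
 * Pure treesame[] -> TREESAME calculation: the incoming TREESAME bit
 * is never read, only overwritten. Other flag bits pass through.
 */
static unsigned int update_treesame_flags(unsigned int flags,
					  const struct treesame_state *st)
{
	unsigned int n;

	flags |= TREESAME;
	for (n = 0; n < st->nparents; n++) {
		if (!st->treesame[n]) {
			flags &= ~TREESAME;
			break;
		}
	}
	return flags;
}

/*
 * Caller-side shortcut: if simplification can only increase sameness,
 * a commit that is already TREESAME needs no recalculation.
 * Note the parentheses: "!flags & TREESAME" would parse as
 * "(!flags) & TREESAME".
 */
static unsigned int maybe_update(unsigned int flags,
				 const struct treesame_state *st)
{
	if (!(flags & TREESAME))
		flags = update_treesame_flags(flags, st);
	return flags;
}
```

Keeping the skip in the caller means update_treesame_flags() stays correct even for a caller that has just cleared an entry in treesame[].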

(Aside - just occurred to me we could swap the loop for "strlen(st->treesame) == st->nparents", if we kept a zero terminator in the array. Maybe a bit too smart-ass?)
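
For what it's worth, the strlen() idea would look roughly like this. It is only a sketch: all_treesame() is a hypothetical helper, and it assumes the array is allocated with nparents + 1 bytes, that entries are only ever 0 or non-zero, and that treesame[nparents] is always 0:

```c
#include <stddef.h>
#include <string.h>

/*
 * strlen() stops at the first zero byte, so it returns nparents only
 * when every per-parent entry is non-zero. Requires a zero terminator
 * at treesame[nparents].
 */
static int all_treesame(const unsigned char *treesame, size_t nparents)
{
	return strlen((const char *)treesame) == nparents;
}
```

It trades the explicit loop for a hidden invariant on the array layout, which is the "too smart" part.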



>> +	for (pp = &commit->parents;
>> +	     (parent = *pp) != NULL;
>> +	     pp = &parent->next, nth_parent++) {
>
> I see the reason to change from while to for is because you wanted
> to count, and I think it makes sense; but it is more readable to
> initialise the counter here, too, if that is the case. I.e.
>
> 	for (pp = &commit->parents, nth_parent = 0;
> 	     !(parent = *pp);
> 	     pp = &parent->next, nth_parent++) {

Agree on nth_parent, but "!(parent = *pp)" isn't "(parent = *pp) != NULL", mind. Did you mean "!!"? In which case I still prefer it my way.
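
To make the difference concrete, here is a small standalone sketch. struct node is a hypothetical stand-in for the parent list, and both functions are illustrative only:

```c
#include <stddef.h>

/* Hypothetical stand-in for a parent list entry. */
struct node {
	struct node *next;
};

/* Condition as in the patch: iterate while there is a parent. */
static int count_with_neq_null(struct node **head)
{
	struct node **pp, *parent;
	int nth_parent;

	for (pp = head, nth_parent = 0;
	     (parent = *pp) != NULL;
	     pp = &parent->next, nth_parent++)
		; /* body omitted in this sketch */
	return nth_parent;
}

/*
 * Negated condition: on a non-empty list the test is false straight
 * away, so the loop never runs. (Do not call this with an empty list:
 * the loop would then be entered with parent == NULL.)
 */
static int count_with_negation(struct node **head)
{
	struct node **pp, *parent;
	int nth_parent;

	for (pp = head, nth_parent = 0;
	     !(parent = *pp);
	     pp = &parent->next, nth_parent++)
		; /* body omitted in this sketch */
	return nth_parent;
}
```

With a three-entry list the first form counts every parent and the second counts none.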


>> +				if (!tree_changed)
>> +					ts->treesame[0] = 1;
>
> Have we made any two tree comparison at this point to set this one?
> Ahh, this is tricky.  You do this in the _second_ iteration of the
> loop, so tree_changed here is from inspecting the first parent, not
> the one we are looking at (i.e. *p).

Yes, this is the "we've reached our second iteration, so from now on we're dealing with a merge" if {} block. I'll clarify this in the comment at the top, and note that we're populating the newly-allocated treesame[] from our first iteration.
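
Roughly, the shape is as follows. This is a simplified sketch rather than the patch itself: struct parent, its "changed" field (standing in for the result of the tree comparison) and scan_parents() are all hypothetical, and the real loop does considerably more:

```c
#include <stdlib.h>

/* Hypothetical: "changed" stands in for the per-parent tree diff result. */
struct parent {
	int changed;
	struct parent *next;
};

struct treesame_state {
	unsigned int nparents;
	unsigned char treesame[];	/* one entry per parent */
};

/*
 * Returns NULL for a non-merge (or on allocation failure); for a merge,
 * returns a state whose treesame[] has one entry per parent.
 */
static struct treesame_state *scan_parents(struct parent *parents,
					   unsigned int nparents)
{
	struct treesame_state *ts = NULL;
	struct parent *p;
	unsigned int nth_parent = 0;
	int tree_changed = 0;

	for (p = parents; p; p = p->next, nth_parent++) {
		if (nth_parent == 1) {
			/*
			 * Second iteration: this is a merge. Allocate the
			 * state now and backfill entry 0 from what the
			 * first iteration found.
			 */
			ts = calloc(1, sizeof(*ts) + nparents);
			if (!ts)
				return NULL;
			ts->nparents = nparents;
			if (!tree_changed)
				ts->treesame[0] = 1;
		}
		if (p->changed)
			tree_changed = 1;
		else if (ts)
			ts->treesame[nth_parent] = 1;
	}
	return ts;
}
```

The point is only the ordering: at the moment of allocation, tree_changed still describes the first parent alone.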


>> @@ -773,6 +861,9 @@ static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *li
>>   	 * NEEDSWORK: decide if we want to remove parents that are
>>   	 * not marked with TMP_MARK from commit->parents for commits
>>   	 * in the resulting list.  We may not want to do that, though.
>> +	 *
>> +	 * Maybe it should be considered if we are TREESAME to such
>> +	 * parents - now possible with stored per-parent flags.
>>   	 */
>
> Hmm, that is certainly a thought.

My comment's wrong though. Reconsidering, what I think needs removing is actually off-ancestry parents that we are !TREESAME to, when we are TREESAME on the ancestry path.

I've realised while testing this that there's been one thing that's confused me repeatedly, and I think this comment was an example of it. The example in the rev-list-options manual is wrong.

          .-A---M---N---O---P
         /     /   /   /   /
        I     B   C   D   E
         \   /   /   /   /
          `-------------'

Contrary to the manual, merge P is !TREESAME to E (or I). E's base is old enough that E isn't up-to-date w.r.t. "foo". Thus merge "P" is no longer TREESAME and does become subject to display with the new --full-history:

   I  A  B  N  D  O  P

I believe this is correct, because P is a merge that determined the fate of "foo", so merits --full-history inspection. (--simplify-merges obviously knocks P back out again: --simplify-merges becomes more important if --full-history gets fuller).
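
As a sketch of why P changes status, assume P's per-parent flags are {1, 0}: TREESAME to O (taken here as parent 0, an assumption) and !TREESAME to E. The two helpers below are hypothetical and just contrast the historical "any parent" rule with the "all parents" rule:

```c
/* Historical rule: TREESAME if the commit matches any one parent. */
static int treesame_any(const unsigned char *treesame, unsigned int nparents)
{
	unsigned int n;

	for (n = 0; n < nparents; n++)
		if (treesame[n])
			return 1;
	return 0;
}

/* Tightened rule: TREESAME only if the commit matches every parent. */
static int treesame_all(const unsigned char *treesame, unsigned int nparents)
{
	unsigned int n;

	for (n = 0; n < nparents; n++)
		if (!treesame[n])
			return 0;
	return 1;
}
```

For flags {1, 0} the first rule says TREESAME and the second says not, which is why P now shows up under --full-history.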

Given this error, and this change, I think this example may want a slight rethink. Do we want a proper "messing with other paths but TREESAME merge" example? Say, if E's parent was O, P would then be TREESAME and so not included in --full-history.


> OK, even though the use of TMP_MARK (meant to be very localized)
> across two functions feel somewhat yucky, they are file scope
> statics next to each other and hopefully are called back to back.

Well, by the end of the series you've got two functions setting it, in preparation for later input to this function. And what's the upper bound on complexity of functions that may want to mark removal? They may need TMP_MARK to do the job. I'm beginning to think that it should be a dedicated REMOVE bit.

Kevin
