Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes: > On Tue, 2 Jun 2009, Junio C Hamano wrote: >> >> Sorry, but I do not quite understand this comment. REV_TREE_NEW can be >> treated differently from REV_TREE_DIFFERENT but that only happens if you >> know about --remove-empty option, and no scripted (and later converted to >> C) Porcelain uses that option by default. > > It's not REV_TREE_NEW, but the other way around, ie when the commit has no > contents but the parent _does_ have contents, maybe we shouldn't then look > at another parent and say "no content", and then match that other parent > (resulting in no difference). > > IOW, we are in the situation where one parent gets REV_TREE_SAME, but gets > it for a totally pointless reason, namely that neither that parent nor the > eventual merge has anything at all in that path. In that case, we simplify > towards the parent that results in the smallest diff - which in this case > is "nothing there at all". Here is a crude attempt to do this. It introduces an diff optflag that records if we checked any changes at the path level (if left unset, that means everything was pruned away by pathspec) and teaches the internal diff-tree logic to set it, so that the caller can tell between "no changes" and "there was no interesting path that matches the pathspec, so comparison between parent and child yielded nothing". Side note. I didn't bother touching the diff-files and diff-index codepaths in this patch; if the distinction turns out to be useful we should teach the flag to them as well (but I'll say it is probably not very useful shortly). After looking at its output, I have to say that I do not like it very much. Compared to "--simplify-merges", it shows too many uninteresting merges. E.g. $ git log --oneline -- git-clone.sh starts like this. 17d778e Merge branch 'dr/ceiling' 159e639 Merge branch 'lt/racy-empty' bc9c3e0 Merge branch 'jc/maint-combine-diff-pre-context' ... All of them are resolving a merge with a revision that has a scripted "git clone" (i.e. "maint" track before built-in git-clone) into a newer revision after "git clone" has become built-in. After a score or so of such uninteresting merges, we finally see what matters. b84c343 Merge branch 'db/clone-in-c' 6d9878c clone: bsd shell portability fix c904bf3 Be more careful with objects directory permissions on clone 8434c2f Build in clone e42251a Use "=" instead of "==" in condition as it is more portable ... This part is actually somewhat interesting. $ git log --oneline --graph b84c343 -- git-clone.sh looks like this: * b84c343 Merge branch 'db/clone-in-c' |\ | * 8434c2f Build in clone * | 6d9878c clone: bsd shell portability fix * | c904bf3 Be more careful with objects dir... |/ * e42251a Use "=" instead of "==" in conditi... * a2b26ac clone: detect and fail on excess p... * c20711d Silence cpio's "N blocks" output w... ... But then that was what "--simplify-merges" gave us without the patch anyway. diff.h | 1 + revision.c | 32 ++++++++++++++++++++++++++++++-- revision.h | 1 + tree-diff.c | 1 + 4 files changed, 33 insertions(+), 2 deletions(-) diff --git a/diff.h b/diff.h index 6616877..fcf1d52 100644 --- a/diff.h +++ b/diff.h @@ -66,6 +66,7 @@ typedef void (*diff_format_fn_t)(struct diff_queue_struct *q, #define DIFF_OPT_DIRSTAT_CUMULATIVE (1 << 19) #define DIFF_OPT_DIRSTAT_BY_FILE (1 << 20) #define DIFF_OPT_ALLOW_TEXTCONV (1 << 21) +#define DIFF_OPT_CHECKED_CHANGES (1 << 22) #define DIFF_OPT_TST(opts, flag) ((opts)->flags & DIFF_OPT_##flag) #define DIFF_OPT_SET(opts, flag) ((opts)->flags |= DIFF_OPT_##flag) #define DIFF_OPT_CLR(opts, flag) ((opts)->flags &= ~DIFF_OPT_##flag) diff --git a/revision.c b/revision.c index bf58448..7440f59 100644 --- a/revision.c +++ b/revision.c @@ -273,7 +273,7 @@ static void file_add_remove(struct diff_options *options, int diff = addremove == '+' ? REV_TREE_NEW : REV_TREE_OLD; tree_difference |= diff; - if (tree_difference == REV_TREE_DIFFERENT) + if ((tree_difference & REV_TREE_DIFFERENT) == REV_TREE_DIFFERENT) DIFF_OPT_SET(options, HAS_CHANGES); } @@ -317,9 +317,13 @@ static int rev_compare_tree(struct rev_info *revs, struct commit *parent, struct tree_difference = REV_TREE_SAME; DIFF_OPT_CLR(&revs->pruning, HAS_CHANGES); + DIFF_OPT_CLR(&revs->pruning, CHECKED_CHANGES); if (diff_tree_sha1(t1->object.sha1, t2->object.sha1, "", &revs->pruning) < 0) return REV_TREE_DIFFERENT; + if (tree_difference == REV_TREE_SAME && + !DIFF_OPT_TST(&revs->pruning, CHECKED_CHANGES)) + tree_difference = REV_TREE_EMPTY; return tree_difference; } @@ -351,7 +355,7 @@ static int rev_same_tree_as_empty(struct rev_info *revs, struct commit *commit) static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit) { struct commit_list **pp, *parent; - int tree_changed = 0, tree_same = 0; + int tree_changed = 0, tree_same = 0, all_empty = 1; /* * If we don't do pruning, everything is interesting @@ -384,7 +388,17 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit) sha1_to_hex(commit->object.sha1), sha1_to_hex(p->object.sha1)); switch (rev_compare_tree(revs, p, commit)) { + case REV_TREE_EMPTY: + /* + * This parent is the same as the child but that is + * only because no path we are interested in appears + * in either of them. Do not cull other parents (yet). + */ + pp = &parent->next; + continue; + case REV_TREE_SAME: + all_empty = 0; tree_same = 1; if (!revs->simplify_history || (p->object.flags & UNINTERESTING)) { /* Even if a merge with an uninteresting @@ -421,12 +435,26 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit) /* fallthrough */ case REV_TREE_OLD: case REV_TREE_DIFFERENT: + all_empty = 0; tree_changed = 1; pp = &parent->next; continue; + } die("bad tree compare for commit %s", sha1_to_hex(commit->object.sha1)); } + + if (all_empty && revs->simplify_history) { + /* + * No path we are interested in appears in any of the + * parents nor in this commit. Just simplify the merge away. + */ + parent = commit->parents; + parent->next = NULL; + commit->object.flags |= TREESAME; + return; + } + if (tree_changed && !tree_same) return; commit->object.flags |= TREESAME; diff --git a/revision.h b/revision.h index 227164c..7064d35 100644 --- a/revision.h +++ b/revision.h @@ -121,6 +121,7 @@ struct rev_info { #define REV_TREE_NEW 1 /* Only new files */ #define REV_TREE_OLD 2 /* Only files removed */ #define REV_TREE_DIFFERENT 3 /* Mixed changes */ +#define REV_TREE_EMPTY 4 /* Same but only because both are empty */ /* revision.c */ void read_revisions_from_stdin(struct rev_info *revs); diff --git a/tree-diff.c b/tree-diff.c index edd8394..692d1cb 100644 --- a/tree-diff.c +++ b/tree-diff.c @@ -298,6 +298,7 @@ int diff_tree(struct tree_desc *t1, struct tree_desc *t2, const char *base, stru update_tree_entry(t1); continue; } + DIFF_OPT_SET(opt, CHECKED_CHANGES); switch (compare_tree_entry(t1, t2, base, baselen, opt)) { case -1: update_tree_entry(t1); -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html