On Sun, 20 Jan 2008, Linus Torvalds wrote: > > The following patch gets us closer, but really only for the "--cached" > case (for the non-cached case, it should use the working tree entry ratehr > than generate a unmerged entry), and I also suspect there's a case it gets > wrong for the case of the unmerged path not existign in the tree at all > (the "if (tree)" entry basically ends up being a stand-in for the "is this > the first index entry for this path we see" case). This cleanup patch really looks big and fairly ugly, but it splits up the magic rules for "unpack_trees()" and adds a lot of comments about what is going on, and in the presense lays the groundwork for doing a much better job on unmerged entries - it now keeps track of how many index entries we have that match the tree entry we found, so it should be pretty trivial to now add the logic to do a combined diff.. It replaces the previous patch, and should fix both of the issues I mention above. It adds a lot more lines than it removes, but much of that is due to some comments and a lot cleaner abstractions, so that the resulting code is, I think, quite a bit more obvious. The unpack_trees() interface really isn't trivial (it can't be: the things we need to do with the index for a merge - at the same time as traversing it! - are quite involved). So clarifying all that is definitely worth it. With this patch in place, I think Dscho's two other patches are now good to go, and we now generate the same output we used to. In addition, we now have the infrastructure to generate *better* output, so if we want to make "git diff HEAD" generate a nice combined diff, this sets the stage for that. Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> --- diff-lib.c | 112 ++++++++++++++++++++++++++++++++++++++++++++++++------------ 1 files changed, 90 insertions(+), 22 deletions(-) diff --git a/diff-lib.c b/diff-lib.c index 23d0fa6..2a3a9ff 100644 --- a/diff-lib.c +++ b/diff-lib.c @@ -668,45 +668,113 @@ static void mark_merge_entries(void) } } -static int oneway_diff(struct cache_entry **src, - struct unpack_trees_options *o, - int remove) +/* + * This gets a mix of an existing index and a tree, one pathname entry + * at a time. The index entry may be a single stage-0 one, but it could + * also be multiple unmerged entries (in which case you idx_pos/idx_nr + * will give you the position and number of entries in the index). + */ +static void do_oneway_diff(struct unpack_trees_options *o, + struct cache_entry *idx, + struct cache_entry *tree, + int idx_pos, int idx_nr) { - struct cache_entry *idx = src[0]; - struct cache_entry *tree = src[1]; struct rev_info *revs = o->unpack_data; - /* - * Unpack-trees generates a DF/conflict entry if - * there was a directory in the index and a tree - * in the tree. From a diff standpoint, that's a - * delete of the tree and a create of the file. - */ - if (tree == o->df_conflict_entry) - tree = NULL; + if (o->index_only && idx && ce_stage(idx)) { + if (tree) + diff_unmerge(&revs->diffopt, idx->name, idx->ce_mode, idx->sha1); + return; + } /* * Something added to the tree? */ if (!tree) { - if (ce_path_match(idx, revs->prune_data)) - show_new_file(revs, idx, o->index_only, 0); - return 1; + show_new_file(revs, idx, o->index_only, 0); + return; } /* * Something removed from the tree? */ if (!idx) { - if (ce_path_match(tree, revs->prune_data)) - diff_index_show_file(revs, "-", tree, tree->sha1, tree->ce_mode); - return 0; + diff_index_show_file(revs, "-", tree, tree->sha1, tree->ce_mode); + return; } /* Show difference between old and new */ - if (ce_path_match(idx, revs->prune_data)) - show_modified(revs, tree, idx, 1, o->index_only, 0); - return 1; + show_modified(revs, tree, idx, 1, o->index_only, 0); +} + +/* + * Count how many index entries go with the first one + */ +static inline int count_skip(const struct cache_entry *src, int pos) +{ + int skip = 1; + + /* We can only have multiple entries if the first one is not stage-0 */ + if (ce_stage(src)) { + struct cache_entry **p = active_cache + pos; + int namelen = ce_namelen(src); + + for (;;) { + const struct cache_entry *ce; + pos++; + if (pos >= active_nr) + break; + ce = *++p; + if (ce_namelen(ce) != namelen) + break; + if (memcmp(ce->name, src->name, namelen)) + break; + skip++; + } + } + return skip; +} + +/* + * The unpack_trees() interface is designed for merging, so + * the different source entries are designed primarily for + * the source trees, with the old index being really mainly + * used for being replaced by the result. + * + * For diffing, the index is more important, and we only have a + * single tree. + * + * We're supposed to return how many index entries we want to skip. + * + * This wrapper makes it all more readable, and takes care of all + * the fairly complex unpack_trees() semantic requirements, including + * the skipping, the path matching, the type conflict cases etc. + */ +static int oneway_diff(struct cache_entry **src, + struct unpack_trees_options *o, + int index_pos) +{ + int skip = 0; + struct cache_entry *idx = src[0]; + struct cache_entry *tree = src[1]; + struct rev_info *revs = o->unpack_data; + + if (index_pos >= 0) + skip = count_skip(idx, index_pos); + + /* + * Unpack-trees generates a DF/conflict entry if + * there was a directory in the index and a tree + * in the tree. From a diff standpoint, that's a + * delete of the tree and a create of the file. + */ + if (tree == o->df_conflict_entry) + tree = NULL; + + if (ce_path_match(idx ? idx : tree, revs->prune_data)) + do_oneway_diff(o, idx, tree, index_pos, skip); + + return skip; } int run_diff_index(struct rev_info *revs, int cached) - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html