Re: [PATCH] Add the --submodule-summary option to the diff option family

Junio C Hamano <gitster@xxxxxxxxx> · Sun, 04 Oct 2009 15:19:25 -0700

Johannes Schindelin <johannes.schindelin@xxxxxx> writes:

> Now you can see the submodule summaries inlined in the diff, instead of
> not-quite-helpful SHA-1 pairs.

This looks useful, but I do not think this is --summary.  It is not part
of the "summary" output but is about making output for Submoodules more
verbose.  I'd suggest naming it --verbose-submodule or something.

> The format imitates what "git submodule summary" shows.

The output format needs to be described better here and also in
Documentation/diff-format.txt.

> To do that, <path>/.git/objects/ is added to the alternate object
> databases (if that directory exists).

Is it always true that <path>/.git is the GIT_DIR for that submodule?  Not
complaining but checking sanity.

> diff --git a/diff.c b/diff.c
> index 9e00131..cdef322 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -1557,6 +1558,17 @@ static void builtin_diff(const char *name_a,
>  	const char *a_prefix, *b_prefix;
>  	const char *textconv_one = NULL, *textconv_two = NULL;
>  
> +	if (DIFF_OPT_TST(o, SUMMARIZE_SUBMODULES) &&
> +			(!one->mode || S_ISGITLINK(one->mode)) &&
> +			(!two->mode || S_ISGITLINK(two->mode))) {
> +		const char *del = diff_get_color_opt(o, DIFF_FILE_OLD);
> +		const char *add = diff_get_color_opt(o, DIFF_FILE_NEW);
> +		show_submodule_summary(o->file, one ? one->path : two->path,
> +				one->sha1, two->sha1,
> +				del, add, reset);
> +		return;
> +	}
> +

Isn't this "textual diff" codepath?

I would have expected --submodule-summary output to be near the --summary
output and to be generated in the same codepath to generate it; if this
were named --verbose-submodule, I would certainly understand it, and I
think the placement is saner in the textual diff.

> diff --git a/submodule.c b/submodule.c
> new file mode 100644
> index 0000000..3f2590d
> --- /dev/null
> +++ b/submodule.c
> @@ -0,0 +1,108 @@
> +#include "cache.h"
> +#include "submodule.h"
> +#include "dir.h"
> +#include "diff.h"
> +#include "commit.h"
> +#include "revision.h"
> +
> +int add_submodule_odb(const char *path)
> +{
> +	struct strbuf objects_directory = STRBUF_INIT;
> +	struct alternate_object_database *alt_odb;
> +
> +	strbuf_addf(&objects_directory, "%s/.git/objects/", path);
> +	if (!is_directory(objects_directory.buf))
> +		return -1;
> +
> +	/* avoid adding it twice */
> +	for (alt_odb = alt_odb_list; alt_odb; alt_odb = alt_odb->next)
> +		if (alt_odb->name - alt_odb->base == objects_directory.len &&
> +				!strncmp(alt_odb->base, objects_directory.buf,
> +					objects_directory.len))
> +			return 0;
> +	alt_odb = xmalloc(objects_directory.len + 42 + sizeof(*alt_odb));
> +	alt_odb->next = alt_odb_list;
> +	strcpy(alt_odb->base, objects_directory.buf);
> +	alt_odb->name = alt_odb->base + objects_directory.len;
> +	alt_odb->name[2] = '/';
> +	alt_odb->name[40] = '\0';
> +	alt_odb->name[41] = '\0';
> +	alt_odb_list = alt_odb;
> +	prepare_alt_odb();

The lines after "avoid adding it twice" look somewhat familiar.

Don't we already have other codepaths to add an alternate odb that need to
be careful in the same way (i.e. do not add twice, do not loop, etc.), and
if so don't you want to reuse it after refactoring?

> +void show_submodule_summary(FILE *f, const char *path,
> +		unsigned char one[20], unsigned char two[20],
> +		const char *del, const char *add, const char *reset)
> +{
> +	struct rev_info rev;
> +	struct commit *commit, *left, *right;
> +	struct commit_list *merge_bases, *list;
> +	const char *message = NULL;
> +	struct strbuf sb = STRBUF_INIT;
> +	static const char *format = "    %m %s";
> +	int fast_forward = 0, fast_backward = 0;
> +
> +	if (add_submodule_odb(path))
> +		message = "(not checked out)";
> +	else if (is_null_sha1(one))
> +		message = "(new submodule)";
> +	else if (is_null_sha1(two))
> +		message = "(submodule deleted)";

Are you sure about this?  Wouldn't "git diff HEAD" (not looking at the
index) give you the 0{40} on the "two" side when the path is modified?

> +	if (!message) {
> +		init_revisions(&rev, NULL);
> +		setup_revisions(0, NULL, &rev, NULL);
> +		rev.left_right = 1;
> +		left->object.flags |= SYMMETRIC_LEFT;
> +		add_pending_object(&rev, &left->object, path);
> +		add_pending_object(&rev, &right->object, path);
> +		merge_bases = get_merge_bases(left, right, 1);
> +		if (merge_bases) {
> +			if (merge_bases->item == left)
> +				fast_forward = 1;
> +			else if (merge_bases->item == right)
> +				fast_backward = 1;
> +		}
> +		for (list = merge_bases; list; list = list->next) {
> +			list->item->object.flags |= UNINTERESTING;
> +			add_pending_object(&rev, &list->item->object,
> +				sha1_to_hex(list->item->object.sha1));
> +		}
> +		if (prepare_revision_walk(&rev))
> +			message = "(revision walker failed)";

If prepare_revision_walk() failed for whatever reason, can we trust
fast_forward/fast_backward at this point?

> +	}
> +
> +	strbuf_addf(&sb, "Submodule %s %s..", path,
> +			find_unique_abbrev(one, DEFAULT_ABBREV));
> +	if (!fast_backward && !fast_forward)
> +		strbuf_addch(&sb, '.');
> +	strbuf_addf(&sb, "%s", find_unique_abbrev(two, DEFAULT_ABBREV));
> +	if (message)
> +		strbuf_addf(&sb, " %s\n", message);
> +	else
> +		strbuf_addf(&sb, "%s:\n", fast_backward ? " (rewind)" : "");

No corresponding "(fast-forward)" label?

> +	fwrite(sb.buf, sb.len, 1, f);
> +
> +	if (!message) {
> +		while ((commit = get_revision(&rev))) {
> +			strbuf_setlen(&sb, 0);
> +			if (del)
> +				strbuf_addstr(&sb, commit->object.flags &
> +						SYMMETRIC_LEFT ? del : add);
> +			format_commit_message(commit, format, &sb,
> +					rev.date_mode);
> +			if (del)
> +				strbuf_addstr(&sb, reset);

Three points, two of which are minor.

 - Checking "del" to decide if you want to say "reset" feels funny.

 - In the "ANSI-terminal only" world view, adding colors to strbuf and
   writing it out together with the actual strings is an easy thing to do.
   Don't Windows folks have trouble converting this kind of code to their
   color control call that is separate from writing strings out?  If it is
   not a problem, I do not have any objection to it, but otherwise I'd
   suggest not to add any more code that stores color escape sequence in
   strbuf, so that we would not make later conversion by Windows folks
   harder than necessary.

 - The output is similar but different from "submodule --summary" and
   there is no justification described in the patch nor the proposed
   commit log message.

   o Why does "submodule --summary" use --first-parent and this patch
     doesn't?

   o "submodule --summary" allows users to limit the walk but this seems
     to give the full list---is it useful (or even sane) to always give
     a full list?

> +			strbuf_addch(&sb, '\n');
> +			fwrite(sb.buf, sb.len, 1, f);
> +		}
> +		clear_commit_marks(left, ~0);
> +		clear_commit_marks(right, ~0);
> +	}

Aren't object flags shared between the outer revision walker that walks
the log and this new walker?  If the supermodule history and submodule
history share commits (e.g. when later git.git starts using submodules to
bind gitk and git-gui), wouldn't this break "git log -p" a big way?

In any case, thanks for an interesting patch.  Looking forward to see how
this topic evolves.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html