Re: [PATCH 7/7] merge: add --rename-notes

Junio C Hamano <gitster@xxxxxxxxx> · Thu, 21 Jan 2016 09:53:23 -0800

Nguyễn Thái Ngọc Duy  <pclouds@xxxxxxxxx> writes:

> +/*
> + * Traverse through the given notes tree, convert all "path to path"
> + * rename lines into "blob to blob" and return it. If cache_file is
> + * non-NULL, return it's content if still valid. Otherwise save the
> + * new content in it.
> + */
> +static void merge_rename_notes(struct strbuf *cache,
> +			       struct notes_tree *t, const char *cache_file)
> +{
> +	struct object_id notes_oid;
> +
> +	if (cache_file) {
> +	...
> +	}
> +
> +	strbuf_reset(cache);
> +	for_each_note(t, 0, merge_rename_note, cache);

This uses _all_ merge notes attached to _any_ commit in the history,
without even checking if the commit is involved in the current merge
being done?  How would that be useful?

I also suspect that the data structure to keep track renames by
using notes needs a better design.  You seem to have a note per
commit and one note records a set of "this goes to that", and
that is the reason why you need to discriminately read everything
under the sun.

I think the index into the notes tree for the purpose of this use
case should not be "which commit records this set of renames?",
but by "what is the destination blob of possible rename
operations?".  IOW, if a path that held blob X was moved to
another path that holds blob Z in commit A, and if a path that
held blob Y was moved to another path that holds blob Z in commit
B, attach a note to blob Z that records "moved from X in A" and
"moved from Y in B".

Then, after diffcore-rename has enumerated the potential rename
destinations, look these destination blobs up in the notes.  You
see a path with blob Z created, and you know if you have removed
path that held X or Y in the potential rename source set, you
found a manually recorded rename.  If you have ancestry
information when you do it, you could even reject "move from X to
Z" when you know commit A is not involved in the ancestry path
involved in the merge, but that is optional (as you may be
comparing two states that may not be related with each other).

For the purpose of helping "git log", a notes tree reorganized in
such a way would be useful.  Again, when you find a potential
rename destination that has blob Z as the result of the change,
you read the note attached to the blob to learn that it may have
come from a path that held blob X (if we are dealing with commit
A), or it may have come from a path that held blob Y (if we are
dealing with commit B).  So you can add a new field in diffopt
structure and allow the caller somewhere in the callchain
(log_tree_diff(), perhaps?) to pass which resulting commit it
wants the recorded renames to be used from.  When it is time for
"git log -M -p" to show commit A, diffcore-rename down in the
callchain will get the diff_queue that contains "diff A^..A",
among which there is a path that used to have blob Z that goes
away, reads the notes for blob Z and notice that commit A moved a
path with blob X that is going away was renamed to it, discarding
the other record for blob Z that talks about the change in commit
B that is not relevant for the purpose of showing commit A.

Hmm?

> @@ -1260,10 +1359,25 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
>  		usage_with_options(builtin_merge_usage,
>  			builtin_merge_options);
>  
> +	if (rename_file && rename_note_ref)
> +		die(_("--rename-file and --rename-notes are incompatible"));
> +
>  	if (rename_file &&
>  	    strbuf_read_file(&manual_renames, rename_file, 0) == -1)
>  		die(_("unable to read %s"), rename_file);
>  
> +	if (rename_note_ref) {
> +		struct notes_tree rename_notes;
> +		struct strbuf ref = STRBUF_INIT;
> +
> +		strbuf_addstr(&ref, rename_note_ref);
> +		expand_notes_ref(&ref);
> +		init_notes(&rename_notes, ref.buf, NULL, 0);
> +		strbuf_release(&ref);
> +		merge_rename_notes(&manual_renames, &rename_notes,
> +				   git_path("GIT_RENAME_CACHE"));
> +	}
> +

These new hunks must move way below the codepath and be run only
after we are certain that we need to do a real merge.  The merge
logic that follows this line tries to handle light-weight special
cases (like an obvious fast-forward and already up-to-date cases)
as early as possible to avoid doing unnecessary things.

This patch (and the earlier one adds strbuf-read-file of the
rename-file) breaks the design by doing these new heavyweight
operations whose result may not be used at all too early, I
think.

If you go the route of reorganizing the rename notes around the
destination blobs, then you wouldn't even be reading everything
here very high in the callchain (you would be doing so only after
diffcore-rename decides that rename notes may be worth reading),
and that would fix this.

>  	if (!head_commit) {
>  		struct commit *remote_head;
>  		/*

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html