Re: [PATCHv3 3/6] merge-recursive: Fix D/F conflicts

Junio C Hamano <gitster@xxxxxxxxx> · Wed, 07 Jul 2010 14:40:24 -0700

newren@xxxxxxxxx writes:

> From: Elijah Newren <newren@xxxxxxxxx>
>
> The D/F conflicts that can be automatically resolved (file or directory
> unmodified on one side of history), have the nice property that
> process_entry() can correctly handle all subpaths of the D/F conflict.  In
> the case of D->F conversions, it will correctly delete all non-conflicting
> files below the relevant directory and the directory itself (note that both
> untracked and conflicting files below the directory will prevent its
> removal).  So if we handle D/F conflicts after all other conflicts, they
> become fairly simple to handle -- we just need to check for whether or not
> a path (file/directory) is in the way of creating the new content.  We do
> this by having process_entry() defer handling such entries to a subsequent
> process_df_entry() step.

The basic idea of this patch to process all removals first and then
process additions in a separate step feels very sound.  It is exactly the
same strategy that is used by "git apply" to avoid a similar issue of
applying a patch that removes "a/b" and creates "a" at the same time.

And it feels both correct and sufficient to have this additional logic
only in the "Added in one, but the path happens to be a directory on the
other branch" codepath.  The only case you will end up having to create D
(blob) where D used to be a directory (hence needing to make sure
everything under D/ is gone) is when you are the only one adding D (and
the other party is not adding nor modifying it---if it were, it means that
the other party also wants D as a blob and not a directory, and there
won't be D/F conflict in such a case).

> Signed-off-by: Elijah Newren <newren@xxxxxxxxx>
> ---
>  merge-recursive.c               |   93 ++++++++++++++++++++++++++++++++-------
>  t/t6035-merge-dir-to-symlink.sh |    8 ++--
>  2 files changed, 81 insertions(+), 20 deletions(-)
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index 206c103..865729a 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -1072,6 +1072,7 @@ static int process_entry(struct merge_options *o,
>  	unsigned char *a_sha = stage_sha(entry->stages[2].sha, a_mode);
>  	unsigned char *b_sha = stage_sha(entry->stages[3].sha, b_mode);
>  
> +	entry->processed = 1;

In the original code, process_renames() used to use this to signal the
loop that drives process_entry() that the stage data has been dealt with,
and this function was unaware of the flag.  Now it uses it to communicate
the same fact to its downstream (process_df_entry), which makes sense, but
somehow it made me feel a bit uneasy.

> @@ -1178,6 +1174,64 @@ static int process_entry(struct merge_options *o,
>  	return clean_merge;
>  }
>  
> +/* Per entry merge function for D/F conflicts, to be called only after
> + * all files below dir have been processed.  We do this because in the
> + * cases we can cleanly resolve D/F conflicts, process_entry() can clean
> + * out all the files below the directory for us.
> + */

/*
 * Please format large comment like
 * this.
 */

Style aside, what happens when we cannot cleanly resolve D/F conflicts,
i.e. when some untracked/modified paths still remain after the earlier
removal?  We would signal conflict but at that point what happens to the
files in the working tree that are involved in this new codepath?  They
are already gone, I presume, and they at least match the original index
entries, so not much is lost, but what would the recovery procedure out of
the resulting mess?  "reset --hard"?

> +static int process_df_entry(struct merge_options *o,
> +			 const char *path, struct stage_data *entry)
> +{
> +	int clean_merge = 1;
> +	unsigned o_mode = entry->stages[1].mode;
> +	unsigned a_mode = entry->stages[2].mode;
> +	unsigned b_mode = entry->stages[3].mode;
> +	unsigned char *o_sha = stage_sha(entry->stages[1].sha, o_mode);
> +	unsigned char *a_sha = stage_sha(entry->stages[2].sha, a_mode);
> +	unsigned char *b_sha = stage_sha(entry->stages[3].sha, b_mode);
> +
> +	/* We currently only handle D->F cases */
> +	assert((!o_sha && a_sha && !b_sha) ||
> +	       (!o_sha && !a_sha && b_sha));
> +	const char *add_branch;
> +	const char *other_branch;
> +	unsigned mode;
> +	const unsigned char *sha;
> +	const char *conf;

As an assert becomes code under debugging build, do not put one before
declarations like this.

Also "struct stat st" later in this file needs to be lifted up for the
same reason to avoid decl-after-statement.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html