Re: [PATCH 2/2] merge-recursive: fix merging a subdirectory into the root directory

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Elijah,

On Fri, 11 Oct 2019, Elijah Newren via GitGitGadget wrote:

> From: Elijah Newren <newren@xxxxxxxxx>
>
> We allow renaming all entries in e.g. a directory named z/ into a
> directory named y/ to be detected as a z/ -> y/ rename, so that if the
> other side of history adds any files to the directory z/ in the mean
> time, we can provide the hint that they should be moved to y/.
>
> There is no reason to not allow 'y/' to be the root directory, but the
> code did not handle that case correctly.  Add a testcase and the
> necessary special checks to support this case.
>
> Signed-off-by: Elijah Newren <newren@xxxxxxxxx>

This makes sense.

> ---
>  merge-recursive.c                   | 29 +++++++++++++++
>  t/t6043-merge-rename-directories.sh | 56 +++++++++++++++++++++++++++++

It is good to have a test case verifying this!

FWIW I frequently run into those same issues because I have to use --
quite often! -- `contrib/fast-import/import-tars.perl` in the Git for
Windows project (in the MSYS2 part thereof, to be precise), and the
`pax_global_header` and I will probably never become friends, so I often
have to move files into the top-level directory...

> diff --git a/merge-recursive.c b/merge-recursive.c
> index f80e48f623..7bd4a7cf10 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -1931,6 +1931,16 @@ static char *apply_dir_rename(struct dir_rename_entry *entry,
>  		return NULL;
>
>  	oldlen = strlen(entry->dir);
> +	if (entry->new_dir.len == 0)
> +		/*
> +		 * If someone renamed/merged a subdirectory into the root
> +		 * directory (e.g. 'some/subdir' -> ''), then we want to
> +		 * avoid returning
> +		 *     '' + '/filename'
> +		 * as the rename; we need to make old_path + oldlen advance
> +		 * past the '/' character.
> +		 */
> +		oldlen++;

This makes sense to me.

>  	newlen = entry->new_dir.len + (strlen(old_path) - oldlen) + 1;
>  	strbuf_grow(&new_path, newlen);
>  	strbuf_addbuf(&new_path, &entry->new_dir);
> @@ -1980,6 +1990,25 @@ static void get_renamed_dir_portion(const char *old_path, const char *new_path,
>  	    *end_of_old == *end_of_new)
>  		return; /* We haven't modified *old_dir or *new_dir yet. */
>
> +	/*
> +	 * If end_of_new got back to the beginning of its string, and
> +	 * end_of_old got back to the beginning of some subdirectory, then
> +	 * we have a rename/merge of a subdirectory into the root, which
> +	 * needs slightly special handling.
> +	 *
> +	 * Note: There is no need to consider the opposite case, with a
> +	 * rename/merge of the root directory into some subdirectory.
> +	 * Our rule elsewhere that a directory which still exists is not
> +	 * considered to have been renamed means the root directory can
> +	 * never be renamed (because the root directory always exists).
> +	 */
> +	if (end_of_new == new_path &&
> +	    end_of_old != old_path && end_of_old[-1] == '/') {
> +		*old_dir = xstrndup(old_path, end_of_old-1 - old_path);
> +		*new_dir = xstrndup(new_path, end_of_new - new_path);

However, here we write something convoluted that essentially amounts to
`xstrdup("")`. I would rather have that simple call than the convoluted
one that would puzzle me every time I have to look at this part of the
code.

While at it, would you mind either surrounding the `-` and the `1` by
spaces, or even write `--end_of_old - old_path`?

> +		return;
> +	}
> +
>  	/*
>  	 * We've found the first non-matching character in the directory
>  	 * paths.  That means the current characters we were looking at
> diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
> index c966147d5d..b920bb0850 100755
> --- a/t/t6043-merge-rename-directories.sh
> +++ b/t/t6043-merge-rename-directories.sh
> @@ -4051,6 +4051,62 @@ test_expect_success '12c-check: Moving one directory hierarchy into another w/ c
>  	)
>  '
>
> +# Testcase 12d, Rename/merge of subdirectory into the root
> +#   Commit O: a/b/{foo.c}
> +#   Commit A: foo.c
> +#   Commit B: a/b/{foo.c,bar.c}
> +#   Expected: a/b/{foo.c,bar.c}
> +
> +test_expect_success '12d-setup: Rename (merge) of subdirectory into the root' '
> +	test_create_repo 12d &&
> +	(
> +		cd 12d &&
> +
> +		mkdir -p a/b/subdir &&
> +		test_commit a/b/subdir/foo.c &&

Why `.c`? That's a little distracting.

> +
> +		git branch O &&

Might be simpler just to use `master` subsequently and not "waste" a new
ref on that.

> +		git branch A &&

Might make more sense to create it below, via the `-b` option of `git
checkout`.

Or, for extra brownie points, via the `-c` option of `git switch`.

> +		git branch B &&

Likewise, this might want to be created below, via replacing `git
checkout B` with `git switch -c B master`.

> +
> +		git checkout A &&
> +		mkdir subdir &&
> +		git mv a/b/subdir/foo.c.t subdir/foo.c.t &&
> +		test_tick &&
> +		git commit -m "A" &&
> +
> +		git checkout B &&
> +		test_commit a/b/bar.c
> +	)
> +'
> +
> +test_expect_success '12d-check: Rename (merge) of subdirectory into the root' '

For the record: I am still a huge anti-fan of splitting `setup` test
cases from the test cases that do actual things, _unless_ it is _one_,
and _only one_, big, honking `setup` test case that is the very first
one in the test script.

Splitting things into two inevitably leads to unnecessary churn when
investigating test failures.

And that's really what test cases need to be optimized for: when they
report breakages. They need to help as much as they can to investigate
why things break. Nobody cares when test cases succeed. The ~150k test
cases that pass on every CI build: nobody is interested. When a test
case reports failure, that's when people pay attention. At least some,
including me.

The most common case for me (and every other lazy person who relies on
CI/PR builds) is when a build breaks, and then I usually get to the
trace of the actually failing test case very quickly. The previous test
case's trace, not so easy. Clicks involved. Now I lose context. Not
good.

A less common case for me is when I run test scripts locally, with `-i
-v -x`. Still, I need to scroll back to get context. And then, really, I
already lost context.

> +	(
> +		cd 12d &&
> +
> +		git checkout A^0 &&
> +
> +		git -c merge.directoryRenames=true merge -s recursive B^0 &&
> +
> +		git ls-files -s >out &&
> +		test_line_count = 2 out &&

Isn't this a bit overzealous?

> +
> +		git rev-parse >actual \
> +			HEAD:subdir/foo.c.t   HEAD:bar.c.t &&
> +		git rev-parse >expect \
> +			O:a/b/subdir/foo.c.t  B:a/b/bar.c.t &&
> +		test_cmp expect actual &&

Likewise?

> +
> +		git hash-object bar.c.t >actual &&
> +		git rev-parse B:a/b/bar.c.t >expect &&
> +		test_cmp expect actual &&

Likewise?

> +
> +		test_must_fail git rev-parse HEAD:a/b/subdir/foo.c.t &&
> +		test_must_fail git rev-parse HEAD:a/b/bar.c.t &&
> +		test_path_is_missing a/

Makes sense, but the part that I am missing is

		test_path_is_file bar.c.t

I.e. the _most_ important outcome of this test is: the rename was
detected, and the added file was correctly placed into the target
directory of the rename.

Thanks,
Dscho

> +	)
> +'
> +
>  ###########################################################################
>  # SECTION 13: Checking informational and conflict messages
>  #
> --
> gitgitgadget
>




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux