Re: [PATCHv3] submodule: Port resolve_relative_url from shell to C

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 13.01.2016 um 01:10 schrieb Stefan Beller:
> Later on we want to deprecate the `git submodule init` command and make
> it implicit in other submodule commands. As these other commands are
> written in C already, we'd need the init functionality in C, too.
> The `resolve_relative_url` function is a major part of that init
> functionality, so start by porting this function to C.

Maybe tone down the word "major" to "a large and non-trivial function"?
Otherwise, the lack of proof for the claim is irritating. (As we saw,
the savings with the port to C are not breath-taking. But we have to
start somewhere.)

> 
> As I was porting the functionality I noticed some odds with the inputs.
> To fully understand the situation I added some logging to the function
> temporarily to capture all calls to the function throughout the test
> suite. Duplicates have been removed and all unique testing inputs have
> been recorded into t0060.
> 
> Signed-off-by: Stefan Beller <sbeller@xxxxxxxxxx>
> ---
...
> +/*
> + * The `url` argument is the URL that navigates to the submodule origin
> + * repo. When relative, this URL is relative to the superproject origin
> + * URL repo. The `up_path` argument, if specified, is the relative
> + * path that navigates from the submodule working tree to the superproject
> + * working tree. Returns the origin URL of the submodule.
> + *
> + * Return either an absolute URL or filesystem path (if the superproject
> + * origin URL is an absolute URL or filesystem path, respectively) or a
> + * relative file system path (if the superproject origin URL is a relative
> + * file system path).
> + *
> + * When the output is a relative file system path, the path is either
> + * relative to the submodule working tree, if up_path is specified, or to
> + * the superproject working tree otherwise.
> + */
> +static char *relative_url(const char *remote_url,
> +				const char *url,
> +				const char *up_path)
> +{
> +	int is_relative = 0;
> +	int colonsep = 0;
> +	char *out;
> +	char *remoteurl = xstrdup(remote_url);
> +	struct strbuf sb = STRBUF_INIT;
> +	size_t len;
> +
> +	len = strlen(remoteurl);
> +	if (is_dir_sep(remoteurl[len]))
> +		remoteurl[len] = '\0';
> +
> +	if (!url_is_local_not_ssh(remoteurl) || is_absolute_path(remoteurl))
> +		is_relative = 0;
> +	else {
> +		is_relative = 1;
> +
> +		/* Prepend a './' to ensure all relative remoteurls start
> +		 * with './' or '../'. */
> +		if (!starts_with_dot_slash(remoteurl) &&
> +		    !starts_with_dot_dot_slash(remoteurl)) {
> +			strbuf_reset(&sb);
> +			strbuf_addf(&sb, "./%s", remoteurl);
> +			free(remoteurl);
> +			remoteurl = strbuf_detach(&sb, NULL);
> +		}
> +	}
> +	/* When the url starts with '../', remove that and the
> +	 * last directory in remoteurl. */
> +	while (url) {
> +		if (starts_with_dot_dot_slash(url)) {
> +			char *rfind;
> +			url += 3;
> +
> +			rfind = last_dir_separator(remoteurl);
> +			if (rfind)
> +				*rfind = '\0';
> +			else {
> +				rfind = strrchr(remoteurl, ':');
> +				if (rfind) {
> +					*rfind = '\0';
> +					colonsep = 1;
> +				} else {
> +					if (is_relative || !strcmp(".", remoteurl))
> +						die(N_("cannot strip one component off url '%s'"), remoteurl);

This must be _("..."), not N_("..."), otherwise no translation
happens at run-time.

> +					else
> +						remoteurl = ".";

This must be xstrdup(".") because remoteurl is free()d below.

> +				}
> +			}
> +		} else if (starts_with_dot_slash(url)) {
> +			url += 2;
> +		} else
> +			break;
> +	}
> +	strbuf_reset(&sb);
> +	strbuf_addf(&sb, "%s%s%s", remoteurl, colonsep ? ":" : "/", url);
> +
> +	if (starts_with_dot_slash(sb.buf))
> +		out = xstrdup(sb.buf + 2);
> +	else
> +		out = xstrdup(sb.buf);
> +	strbuf_reset(&sb);
> +
> +	free(remoteurl);
> +	if (!up_path || !is_relative)
> +		return out;

Nit: This conditional clearly feels like (and is) an early function
exit...

> +	else {
> +		strbuf_addf(&sb, "%s%s", up_path, out);
> +
> +		free(out);
> +		return strbuf_detach(&sb, NULL);
> +	}

... hence, you don't need to place the rest of the function into the
else branch.

> +}

up_path seems to be ignored when remoteurl is absolute. Is that
combination an invalid use case?

I think that you strike a good balance between a direct rewrite
of the shell function and possible optimizations. Therefore,
further improvements should go into separate patches.

> diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
> index 627ef85..2ae1bbd 100755
> --- a/t/t0060-path-utils.sh
> +++ b/t/t0060-path-utils.sh
> @@ -19,6 +19,12 @@ relative_path() {
>   	"test \"\$(test-path-utils relative_path '$1' '$2')\" = '$expected'"
>   }
>   
> +test_submodule_relative_url() {
> +	expected="$4"
> +	test_expect_success "test_submodule_relative_url: $1 $2 $3 => $4" \
> +	"test \"\$(git submodule--helper resolve-relative-url-test '$1' '$2' '$3')\" = '$expected'"
> +}

relative_url() has an error exit mode that is not unlikely to trigger,
but the way this test is written such a failure would be detected
only indirectly as a mismatch between actual and expected result. The
reason is that a failed command substitution does not cause the
surrounding command to fail unless it is in the last variable
assignment. Therefore, I suggest to write the helper function like this:

test_submodule_relative_url() {
	test_expect_success "test_submodule_relative_url: $1 $2 $3 => $4" "
		actual=\$(git submodule--helper resolve-relative-url-test '$1' '$2' '$3') &&
		"test \"\$actual\" = '$4'
	"
}

> +
>   test_git_path() {
>   	test_expect_success "git-path $1 $2 => $3" "
>   		$1 git rev-parse --git-path $2 >actual &&
> @@ -286,4 +292,39 @@ test_git_path GIT_COMMON_DIR=bar config                   bar/config
>   test_git_path GIT_COMMON_DIR=bar packed-refs              bar/packed-refs
>   test_git_path GIT_COMMON_DIR=bar shallow                  bar/shallow
>   
> +test_submodule_relative_url "(null)" "../foo/bar" "../bar/a/b/c" "../foo/bar/a/b/c"
> +test_submodule_relative_url "../../../" "../foo/bar" "../bar/a/b/c" "../../../../foo/bar/a/b/c"

In these two cases, it is unclear whether the "bar" in the 4th
argument is copied from the 2nd or the 3rd argument. I suggest to
use a different token:

test_submodule_relative_url "(null)" "../foo/bar" "../sub/a/b/c" "../foo/sub/a/b/c"
test_submodule_relative_url "../../../" "../foo/bar" "../sub/a/b/c" "../../../../foo/sub/a/b/c"

> +test_submodule_relative_url "(null)" "../foo/bar" "../submodule" "../foo/submodule"
> +test_submodule_relative_url "../" "../foo/bar" "../submodule" "../../foo/submodule"
> +test_submodule_relative_url "(null)" "../foo/submodule" "../submodule" "../foo/submodule"
> +test_submodule_relative_url "../" "../foo/submodule" "../submodule" "../../foo/submodule"
> +test_submodule_relative_url "(null)" "../foo" "../submodule" "../submodule"
> +test_submodule_relative_url "../" "../foo" "../submodule" "../../submodule"
> +test_submodule_relative_url "(null)" "./foo/bar" "../submodule" "foo/submodule"
> +test_submodule_relative_url "../" "./foo/bar" "../submodule" "../foo/submodule"
> +test_submodule_relative_url "(null)" "./foo" "../submodule" "submodule"
> +test_submodule_relative_url "../" "./foo" "../submodule" "../submodule"
> +test_submodule_relative_url "(null)" "//somewhere else/repo" "../subrepo" "//somewhere else/subrepo"
> +test_submodule_relative_url "(null)" "/u//trash directory.t7406-submodule-update/subsuper_update_r" "../subsubsuper_update_r" "/u//trash directory.t7406-submodule-update/subsubsuper_update_r"
> +test_submodule_relative_url "(null)" "/u//trash directory.t7406-submodule-update/super_update_r2" "../subsuper_update_r" "/u//trash directory.t7406-submodule-update/subsuper_update_r"
> +test_submodule_relative_url "(null)" "/u/trash directory.t3600-rm/." "../." "/u/trash directory.t3600-rm/."
> +test_submodule_relative_url "(null)" "/u/trash directory.t3600-rm" "./." "/u/trash directory.t3600-rm/."
> +test_submodule_relative_url "(null)" "/u/trash directory.t7400-submodule-basic/addtest" "../repo" "/u/trash directory.t7400-submodule-basic/repo"
> +test_submodule_relative_url "../" "/u/trash directory.t7400-submodule-basic/addtest" "../repo" "/u/trash directory.t7400-submodule-basic/repo"
> +test_submodule_relative_url "(null)" "/u/trash directory.t7400-submodule-basic" "./å äö" "/u/trash directory.t7400-submodule-basic/å äö"
> +test_submodule_relative_url "(null)" "/u/trash directory.t7403-submodule-sync/." "../submodule" "/u/trash directory.t7403-submodule-sync/submodule"
> +test_submodule_relative_url "(null)" "/u/trash directory.t7407-submodule-foreach/submodule" "../submodule" "/u/trash directory.t7407-submodule-foreach/submodule"
> +test_submodule_relative_url "(null)" "/u/trash directory.t7409-submodule-detached-worktree/home2/../remote" "../bundle1" "/u/trash directory.t7409-submodule-detached-worktree/home2/../bundle1"
> +test_submodule_relative_url "(null)" "/u/trash directory.t7613-merge-submodule/submodule_update_repo" "./." "/u/trash directory.t7613-merge-submodule/submodule_update_repo/."
> +test_submodule_relative_url "(null)" "file:///tmp/repo" "../subrepo" "file:///tmp/subrepo"
> +test_submodule_relative_url "(null)" "foo/bar" "../submodule" "foo/submodule"
> +test_submodule_relative_url "../" "foo/bar" "../submodule" "../foo/submodule"
> +test_submodule_relative_url "(null)" "foo" "../submodule" "submodule"
> +test_submodule_relative_url "../" "foo" "../submodule" "../submodule"
> +test_submodule_relative_url "(null)" "helper:://hostname/repo" "../subrepo" "helper:://hostname/subrepo"
> +test_submodule_relative_url "(null)" "ssh://hostname/repo" "../subrepo" "ssh://hostname/subrepo"
> +test_submodule_relative_url "(null)" "ssh://hostname:22/repo" "../subrepo" "ssh://hostname:22/subrepo"
> +test_submodule_relative_url "(null)" "user@host:path/to/repo" "../subrepo" "user@host:path/to/subrepo"
> +test_submodule_relative_url "(null)" "user@host:repo" "../subrepo" "user@host:subrepo"
> +
>   test_done
> 

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]