Re: [PATCH] submodule: Port resolve_relative_url from shell to C

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Stefan Beller <sbeller@xxxxxxxxxx> writes:

> Later on we want to deprecate the `git submodule init` command and make
> it implicit in other submodule commands.

I doubt there is a concensus for "deprecate" part to warrant the use
of "we want" here.  I tend to think that the latter half of the
sentence is uncontroversial, i.e. it is a good idea to make other
"submodule" subcommands internally call it when it makes sense, and
also make knobs available to other commands like "clone" and
possibly "checkout" so that the users do not have to do the
"submodule init" as a separate step, though.

> As I was porting the functionality I noticed some odds with the inputs.

I can parse but cannot quite grok.  You found some strange things in
the input?  Whose input, that comes from where given by whom?

> To fully understand the situation I added some logging to the function
> temporarily to capture all calls to the function throughout the test
> suite. Duplicates have been removed and all unique testing inputs have
> been recorded into t0060.

I can also parse this, but it is unclear what you did to the
temporary debugging help at the end.  If you left it, then that is
no longer a temporary but is part of the final product.  It is also
unclear what "Duplicates" you are talking about here.

Do you mean that you found some of the existing tests were odd, and
after examination with help from a temporary hack which does not
remain in this patch, you determined that some tests were duplicated,
which you removed, while adding new ones?

>  builtin/submodule--helper.c | 189 ++++++++++++++++++++++++++++++++++++++++++++
>  git-submodule.sh            |  81 +------------------
>  t/t0060-path-utils.sh       |  42 ++++++++++
>  3 files changed, 235 insertions(+), 77 deletions(-)
>
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index f4c3eff..3e58b5d 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -9,6 +9,193 @@
>  #include "submodule-config.h"
>  #include "string-list.h"
>  #include "run-command.h"
> +#include "remote.h"
> +#include "refs.h"
> +#include "connect.h"
> +
> +static char *get_default_remote(void)
> +{
> +	char *dest = NULL, *ret;
> +	unsigned char sha1[20];
> +	int flag;
> +	struct strbuf sb = STRBUF_INIT;
> +	const char *refname = resolve_ref_unsafe("HEAD", 0, sha1, &flag);
> +
> +	if (!refname)
> +		die("No such ref: HEAD");
> +
> +	refname = shorten_unambiguous_ref(refname, 0);
> +	strbuf_addf(&sb, "branch.%s.remote", refname);

Is it correct to use shorten_unambiguous_ref() here like this?  The
function is meant to be used when you want heads/frotz because you
have both refs/heads/frotz and refs/tags/frotz at the same time.  I
think you want to say branch.frotz.remote even in such a case.  IOW,
shouldn't it be skip_prefix() with refs/heads/, together with die()
if the prefix is something else?

> +	if (git_config_get_string(sb.buf, &dest))
> +		ret = xstrdup("origin");
> +	else
> +		ret = xstrdup(dest);
> +
> +	strbuf_release(&sb);
> +	return ret;
> +}
> +
> +static int starts_with_dot_slash(const char *str)
> +{
> +	return str[0] == '.' && is_dir_sep(str[1]);
> +}
> +
> +static int starts_with_dot_dot_slash(const char *str)
> +{
> +	return str[0] == '.' && str[1] == '.' && is_dir_sep(str[2]);
> +}
> +
> +static char *last_dir_separator(char *str)
> +{
> +	char* p = str + strlen(str);

Asterisk sticks to the variable, not the type.

> +	while (p-- != str)

It is preferable to use '>' not '!=' here, because you know p
approaches str from the larger side, for readability.

> +		if (is_dir_sep(*p))
> +			return p;
> +	return NULL;
> +}

(a useless comment) This is one of the rare places where I wish
there were a version of strcspn() that scans from the right.

> +static char *relative_url(const char *remote_url,
> +				const char *url,
> +				const char *up_path)
> +{
> +	int is_relative = 0;
> +	int colonsep = 0;
> +	char *out;
> +	char *remoteurl = xstrdup(remote_url);
> +	struct strbuf sb = STRBUF_INIT;
> +	size_t len;
> +
> +	len = strlen(remoteurl);

Nothing wrong here, but it looked somewhat inconsistent to see this
assignment, while remoteurl is done as an initialization [*1*]


[Footnote]

*1* as a personal preference, I tend to prefer seeing only simple
operations in initialization and heavyweight ones as a separate
assignment to an otherwise uninitialized variable, and strlen() is
lighter-weight than xstrdup() in my dictionary.



> +	if (is_dir_sep(remoteurl[len]))
> +		remoteurl[len] = '\0';
> +
> +	if (!url_is_local_not_ssh(remoteurl) || is_absolute_path(remoteurl))
> +		is_relative = 0;
> +	else {
> +		is_relative = 1;
> +
> +		/* Prepend a './' to ensure all relative remoteurls start
> +		 * with './' or '../'. */

Adjust the style, and perhaps remove the final full-stop to make the
last string literal easier to see?  I.e.

	/*
         * Prepend a './' to ensure all relative remoteurls
         * start with './' or '../'
         */

would be easier to see what it is said.

> +		if (!starts_with_dot_slash(remoteurl) &&
> +		    !starts_with_dot_dot_slash(remoteurl)) {
> +			strbuf_reset(&sb);
> +			strbuf_addf(&sb, "./%s", remoteurl);
> +			free(remoteurl);
> +			remoteurl = strbuf_detach(&sb, NULL);
> +		}
> +	}
> +	/* When the url starts with '../', remove that and the
> +	 * last directory in remoteurl. */

Style.

> +	while (url) {
> +		if (starts_with_dot_dot_slash(url)) {
> +			char *rfind;
> +			url += 3;
> +
> +			rfind = last_dir_separator(remoteurl);
> +			if (rfind)
> +				*rfind = '\0';
> +			else {
> +				rfind = strrchr(remoteurl, ':');
> +				if (rfind) {
> +					*rfind = '\0';
> +					colonsep = 1;
> +				} else {
> +					if (is_relative || !strcmp(".", remoteurl))
> +						die(_("cannot strip one component off url '%s'"), remoteurl);
> +					else
> +						remoteurl = xstrdup(".");
> +				}
> +			}

It is somewhat hard to see how this avoids stripping one (or both)
slashes just after "http:" in remoteurl="http://site/path/";, leaving
just "http:/" (or "http:").

This codepath has overly deep nesting levels.  Is this the simplest
we can do?

The final else { if .. else } can be made into else if .. else to
dedent the overlong die() by one level, but I am wondering if the
deep nesting is just a symptom of logic being unnecessarily complex.

> +		} else if (starts_with_dot_slash(url)) {
> +			url += 2;
> +		} else
> +			break;
> +	}
> +	strbuf_reset(&sb);
> +	strbuf_addf(&sb, "%s%s%s", remoteurl, colonsep ? ":" : "/", url);
> +
> +	if (starts_with_dot_slash(sb.buf))
> +		out = xstrdup(sb.buf + 2);
> +	else
> +		out = xstrdup(sb.buf);
> +	strbuf_reset(&sb);
> +
> +	free(remoteurl);
> +	if (!up_path || !is_relative)
> +		return out;
> +
> +	strbuf_addf(&sb, "%s%s", up_path, out);
> +	free(out);
> +	return strbuf_detach(&sb, NULL);
> +}

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]