Re: [PATCH] Documentation: clarify multiple pushurls vs urls

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Mon, 06 Feb 2023 21:11:51 +0100

On Mon, Feb 06 2023, Calvin Wan wrote:

> While it is possible to define multiple `url` fields in a remote to
> push to multiple remotes at once, it is preferable to achieve this by
> defining multiple `pushurl` fields.

The idea with "url" and "pushurl" surely is to disambiguate whether you
want the url for both push & fetch, or just push.

I don't yet see why it's a given that "pushurl" is preferable to "url".
If your setup is e.g. 3 repos you push to, and it doesn't matter which
one you pull from, why not use "url"? As opposed to using "pushurl" to
push to read-only mirrors you yourself are updating?

But let's read on...

> Defining multiple `url` fields can cause confusion for users since
> running `git config remote.<remote>.url` returns the last defined url
> which doesn't align with the url `git fetch <remote>` uses (the first).

I'm certainly confused. I had no idea it worked this way; I'd have
thought it was last-set-wins like most things.
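
For anyone who wants to reproduce the mismatch described in the quoted
text, here is a minimal sketch in a scratch repository. The remote name
"demo" and both example.com URLs are made up for illustration, and it
assumes a reasonably recent git that has "git remote get-url":

```shell
# Scratch repository; nothing here touches an existing checkout.
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"

# Define two "url" values for the hypothetical remote "demo".
git config --add remote.demo.url https://example.com/first.git
git config --add remote.demo.url https://example.com/second.git

# A plain lookup of a multi-valued key prints the last value.
last=$(git config remote.demo.url)

# --get-all prints every value, in the order they were defined.
all=$(git config --get-all remote.demo.url)

# "git remote get-url" reports the first URL instead.
first=$(git remote get-url demo)

echo "git config:         $last"
echo "git remote get-url: $first"
```

The two commands disagree: the first reports second.git and the second
reports first.git, which is the first-versus-last split at issue here.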

From a glance, fb0cc87ec0f (Allow programs to not depend on remotes
having urls, 2009-11-18) mentions it as a known factor, but with:

	diff --git a/transport.c b/transport.c
	index 77a61a9d7bb..06159c4184e 100644
	--- a/transport.c
	+++ b/transport.c
	@@ -1115,7 +1115,7 @@ struct transport *transport_get(struct remote *remote, const char *url)
	 	helper = remote->foreign_vcs;

	 	if (!url && remote->url)
	-		url = remote->url[0];
	+		url = remote->url[remote->url_nr - 1];
	 	ret->url = url;

	 	/* maybe it is a foreign URL? */

All tests pass for me, and it's selecting the last URL now. I can't find
any other mention of these semantics in the docs (but maybe I didn't
look in the right places).

So is this just some accident, does anyone rely on it, and would we be
better off just "fixing" this, rather than steering people away from
"url"?

> Add documentation to clarify how fetch interacts with multiple urls
> and the recommended method to push to multiple remotes.
>
> Signed-off-by: Calvin Wan <calvinwan@xxxxxxxxxx>
> ---
>  Documentation/urls-remotes.txt | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/urls-remotes.txt b/Documentation/urls-remotes.txt
> index 86d0008f94..61aaded645 100644
> --- a/Documentation/urls-remotes.txt
> +++ b/Documentation/urls-remotes.txt
> @@ -33,7 +33,10 @@ config file would appear like this:
>  ------------
>  
>  The `<pushurl>` is used for pushes only. It is optional and defaults
> -to `<URL>`.
> +to `<URL>`. Additional pushurls can be defined to push to multiple
> +remotes. While multiple URLs can be defined to achieve the same
> +outcome, this is not recommended since fetch only uses the first
> +defined URL.

Maybe it's just me, but I feel more confused reading these docs than
before :)

Surely if there's confusion about the priority of the *.url config
variable we should be documenting that explicitly where we discuss "url"
itself (e.g. in Documentation/config/remote.txt). Just mentioning it in
passing as we document "pushUrl" feels like the wrong place.

But I still don't quite see the premise. "git push" has a feature to
push to all N urls, whether that's Url or pushUrl.
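
To make that concrete, a rough sketch using only local bare
repositories. The remote name "demo", the a.git/b.git paths, the branch
name "main" and the commit identity are all placeholders I picked for
the example:

```shell
# Two bare repositories standing in for the push targets, plus a work repo.
tmp=$(mktemp -d)
git init -q --bare "$tmp/a.git"
git init -q --bare "$tmp/b.git"
git init -q "$tmp/work"
cd "$tmp/work"

# Two plain "url" values and no "pushurl" at all.
git config --add remote.demo.url "$tmp/a.git"
git config --add remote.demo.url "$tmp/b.git"

# An empty commit is enough to have something to push.
git -c user.name=Demo -c user.email=demo@example.com \
	commit -q --allow-empty -m initial

# A single push to the remote updates every configured URL.
git push -q demo HEAD:refs/heads/main

head=$(git rev-parse HEAD)
in_a=$(git --git-dir="$tmp/a.git" rev-parse refs/heads/main)
in_b=$(git --git-dir="$tmp/b.git" rev-parse refs/heads/main)
echo "work: $head"
echo "a:    $in_a"
echo "b:    $in_b"
```

Swapping "url" for "pushurl" in the two "git config --add" lines should
give the same result for the push, as far as I can tell.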

When I configure it to have multiple URLs it pushes to the first
configured one first. If the source of the confusion was that it didn't
prefer the last configured one, shouldn't it be pushing to them in
reverse order?

I don't think that would make sense, but I also don't see how
recommending "pushurl" over "url" un-confuses things.

So why is it confusing that "fetch" would use the same order, but due to
the semantics of a "fetch" we'd stop after the first one?