Re: [PATCH v2] builtin/push: call set_refspecs after validating remote

Junio C Hamano <gitster@xxxxxxxxx> · Tue, 09 Jul 2024 11:44:25 -0700

Karthik Nayak <karthik.188@xxxxxxxxx> writes:

> - Added more details to the commit message to clarify the issue at hand.

> Since 9badf97c42 (remote: allow resetting url list, 2024-06-14), we
> reset the remote URL if the provided URL is empty. When a user of
> 'remotes_remote_get' tries to fetch a remote with an empty repo name,
> the function initializes the remote via 'make_remote'. But the remote is
> still not a valid remote, since the URL is empty, so it tries to add the
> URL alias using 'add_url_alias'. This in-turn will call 'add_url', but
> since the URL is empty we call 'strvec_clear' on the `remote->url`. Back
> in 'remotes_remote_get', we again check if the remote is valid, which
> fails, so we return 'NULL' for the 'struct remote *' value

That's better, but it still talks only about the implementation and
control flow description without mentioning what end-user action it
breaks.  We see in the new test this:

    $ git push "" main

but is that something we even want to support?  Before 9badf97c
(remote: allow resetting url list, 2024-06-14), the command would
have failed with different way [*1*].

Is it merely "this is a nonsense request and must fail, but we do
not want to hit BUG in general"?  I think it is the latter, but
leaving it unsaid is confusing.  How about starting it more like...

    When an end-user runs "git push" with an empty string for the
    remote repository name, e.g.

        $ git push '' main

    "git push" fails with a BUG().  This is because ...

or something.

cmd_push() calls set_refspecs() on a repo (whose name is ""), which
then calls remote.c:remote_get() on it, which calls
remote.c:make_remote() to obtain a remote structure.

But during this calling sequence especially down from remote_get(),
there are inconsistent decisions made for an empty string as a name.
It is not a NULL pointer, so it does not benefit from the default
refspecs by calling get_default=remotes_remote_for_branch,
valid_remote_nick() considers it is not a valid nick so we do not
read remotes or branches configuration file, but we still think a
name was given (even though it is an empty string) and try to do
something useful by calling add_url_alias().

It is a mess and whoever designed this should be shot [*2*] ;-).

In any case, an obvious additional fix on top of your change might
be to do something like this:

        diff --git i/remote.c w/remote.c
        index 5fa046c8f8..d7f9ba3571 100644
        --- i/remote.c
        +++ w/remote.c
        @@ -682,7 +682,7 @@ remotes_remote_get_1(
                struct remote *ret;
                int name_given = 0;

        -	if (name)
        +	if (name && *name)
                        name_given = 1;
                else
                        name = get_default(remote_state, remote_state->current_branch,

which would give us the default remote name, and we would not call
add_url_alias() with a bogus empty string to nuke the list.

> - Cleaned up 'set_refspecs' by now accepting a remote instead of trying
>   to obtain one. 

The last one, which does make sense, is not mentioned in the
proposed log message.  Lazy remote creation does not help the
updated caller because it already has one.

> +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
> +
>  TEST_PASSES_SANITIZE_LEAK=true
>  . ./test-lib.sh
>  
> @@ -38,6 +41,13 @@ test_expect_success 'detect missing sha1 expressions early' '
>  	test_cmp expect rp-ran
>  '
>  
> +# We need to use an existing local_ref so that the remote is mapped to
> +# it in 'builtin/push.c:set_refspecs()'.

Hmph, it is OK to force 'main' like the above as a workaround, but
wouldn't it be sufficient to do

    $ git push "" HEAD:refs/heads/main

or something to avoid having to know what our current branch is?

Thanks.

[Footnote }

 *1* For example, before Peff's series, here is what I got:

         $ git push "" HEAD:master
         fatal: no path specified; see 'git help pull' for valid url syntax

     It comes from transport_push() -> get_refs_via_connect() ->
     parse_connect_url() code path.

 *2* It probably is me writing in the original in shell script, so I
     can safely say this without offending anybody.