Re: [PATCH 1/2] Add fetch.updateHead option

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Apr 5, 2023 at 4:28 AM Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> wrote:
>
>
> On Tue, Apr 04 2023, Felipe Contreras wrote:
>
> > Users might change the behavior when running "git fetch" so that the
> > remote's HEAD symbolic ref is updated at certain point.
> >
> > For example after running "git remote add" the remote HEAD is not
> > set like it is with "git clone".
> >
> > Setting "fetch.updatehead = missing" would probably be a sensible
> > default that everyone would want, but for now the default behavior is to
> > never update HEAD, so there shouldn't be any functional changes.
> >
> > For the next major version of Git, we might want to change this default.
> >
> > Signed-off-by: Felipe Contreras <felipe.contreras@xxxxxxxxx>
> > ---
> >  Documentation/config/fetch.txt  |  4 +++
> >  Documentation/config/remote.txt |  3 ++
> >  builtin/fetch.c                 | 64 ++++++++++++++++++++++++++++++++-
> >  remote.c                        | 21 +++++++++++
> >  remote.h                        | 11 ++++++
> >  t/t5510-fetch.sh                | 31 ++++++++++++++++
> >  6 files changed, 133 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/config/fetch.txt b/Documentation/config/fetch.txt
> > index 568f0f75b3..dc147ffb35 100644
> > --- a/Documentation/config/fetch.txt
> > +++ b/Documentation/config/fetch.txt
> > @@ -120,3 +120,7 @@ fetch.bundleCreationToken::
> >  The creation token values are chosen by the provider serving the specific
> >  bundle URI. If you modify the URI at `fetch.bundleURI`, then be sure to
> >  remove the value for the `fetch.bundleCreationToken` value before fetching.
> > +
> > +fetch.updateHead::
> > +     Defines when to update the remote HEAD symbolic ref. Values are 'never',
> > +     'missing' (update only when HEAD is missing), and 'always'.
> > diff --git a/Documentation/config/remote.txt b/Documentation/config/remote.txt
> > index 0678b4bcfe..9d739d2ed4 100644
> > --- a/Documentation/config/remote.txt
> > +++ b/Documentation/config/remote.txt
> > @@ -86,3 +86,6 @@ remote.<name>.partialclonefilter::
> >       Changing or clearing this value will only affect fetches for new commits.
> >       To fetch associated objects for commits already present in the local object
> >       database, use the `--refetch` option of linkgit:git-fetch[1].
> > +
> > +remote.<name>.updateHead::
> > +     Defines when to update the remote HEAD symbolic ref. See `fetch.updateHead`.
> > diff --git a/builtin/fetch.c b/builtin/fetch.c
> > index 7221e57f35..7e93a1aa46 100644
> > --- a/builtin/fetch.c
> > +++ b/builtin/fetch.c
> > @@ -59,6 +59,8 @@ static int fetch_prune_tags_config = -1; /* unspecified */
> >  static int prune_tags = -1; /* unspecified */
> >  #define PRUNE_TAGS_BY_DEFAULT 0 /* do we prune tags by default? */
> >
> > +static int fetch_update_head = FETCH_UPDATE_HEAD_DEFAULT;
> > +
> >  static int all, append, dry_run, force, keep, multiple, update_head_ok;
> >  static int write_fetch_head = 1;
> >  static int verbosity, deepen_relative, set_upstream, refetch;
> > @@ -129,6 +131,9 @@ static int git_fetch_config(const char *k, const char *v, void *cb)
> >               return 0;
> >       }
> >
> > +     if (!strcmp(k, "fetch.updatehead"))
> > +             return parse_update_head(&fetch_update_head, k, v);
> > +
> >       return git_default_config(k, v, cb);
> >  }
> >
> > @@ -1579,6 +1584,47 @@ static int backfill_tags(struct transport *transport,
> >       return retcode;
> >  }
> >
> > +static void update_head(int config, const struct ref *head, const struct remote *remote)
>
> Here you pass a "const struct remote".
>
> > +{
> > +     char *ref, *target;
> > +     const char *r;
> > +     int flags;
> > +
> > +     if (!head || !head->symref || !remote)
> > +             return;
> > +
> > +     ref = apply_refspecs((struct refspec *)&remote->fetch, "refs/heads/HEAD");
> > +     target = apply_refspecs((struct refspec *)&remote->fetch, head->symref);
>
> But here we end up with this cast, as it's not const after all, we're
> modifying it.

It is a const, and we are not modifying it. `apply_refspecs()` is not
saying what it should say: the refspec remains constant.

As Patrick explained: `apply_refspecs()` should probably be fixed.

> > +
> > +     if (!ref || !target) {
> > +             warning(_("could not update remote head"));
> > +             return;
> > +     }
> > +
> > +     r = resolve_ref_unsafe(ref, 0, NULL, &flags);
> > +
> > +     if (r) {
> > +             if (config == FETCH_UPDATE_HEAD_MISSING) {
> > +                     if (flags & REF_ISSYMREF)
> > +                             /* already present */
> > +                             return;
> > +             } else if (config == FETCH_UPDATE_HEAD_ALWAYS) {
> > +                     if (!strcmp(r, target))
> > +                             /* already up-to-date */
> > +                             return;
>
> I think you should name the "enum" you're adding below, the one that
> contains the new "FETCH_UPDATE_HEAD_DEFAULT".
>
> Then this could be a "switch", and the compiler could check the
> arguments, i.e. you could pass an enum type instead of an "int".

Sure, it can be an `enum fetch_update_mode` instead of `int`, but I
don't see what value it provides, other than more verbosity. The enum
right above is also unnamed, and 'remote->origin' is an int. And it's
not the only enum of that kind in the source code.

Using a switch is better, but that doesn't require an enum type. The
multiple ifs are just a remnant of a previous version of the code.

> > +             } else
>
> {} missing, if you keep this, but...
>
> > +                     /* should never happen */
> > +                     return;
>
> ...so, here we're not checking some enum values, but presumably other
> things check this, I haven't checked.

Yes, the function cannot be called otherwise.

> But for a "should never happen", should we make this a "BUG()", or is it
> user-controlled?

Sure, it can be a `BUG()`. It truly should not happen.

> > +     }
> > +
> > +     if (!create_symref(ref, target, "remote update head")) {
> > +             if (verbosity >= 0)
> > +                     printf(_("Updated remote '%s' HEAD\n"), remote->name);
> > +     } else {
> > +             warning(_("could not update remote head"));
> > +     }
> > +}
> > +
> >  static int do_fetch(struct transport *transport,
> >                   struct refspec *rs)
> >  {
> > @@ -1592,6 +1638,7 @@ static int do_fetch(struct transport *transport,
> >       int must_list_refs = 1;
> >       struct fetch_head fetch_head = { 0 };
> >       struct strbuf err = STRBUF_INIT;
> > +     int need_update_head = 0, update_head_config = 0;
> >
> >       if (tags == TAGS_DEFAULT) {
> >               if (transport->remote->fetch_tags == 2)
> > @@ -1626,9 +1673,21 @@ static int do_fetch(struct transport *transport,
> >       } else {
> >               struct branch *branch = branch_get(NULL);
> >
> > -             if (transport->remote->fetch.nr)
> > +             if (transport->remote->fetch.nr) {
> > +
> > +                     if (transport->remote->update_head)
> > +                             update_head_config = transport->remote->update_head;
> > +                     else
> > +                             update_head_config = fetch_update_head;
> > +
> > +                     need_update_head = update_head_config && update_head_config != FETCH_UPDATE_HEAD_NEVER;
> > +
> > +                     if (need_update_head)
> > +                             strvec_push(&transport_ls_refs_options.ref_prefixes, "HEAD");
> >                       refspec_ref_prefixes(&transport->remote->fetch,
> >                                            &transport_ls_refs_options.ref_prefixes);
> > +             }
> > +
> >               if (branch_has_merge_config(branch) &&
> >                   !strcmp(branch->remote_name, transport->remote->name)) {
> >                       int i;
> > @@ -1737,6 +1796,9 @@ static int do_fetch(struct transport *transport,
> >
> >       commit_fetch_head(&fetch_head);
> >
> > +     if (need_update_head)
> > +             update_head(update_head_config, find_ref_by_name(remote_refs, "HEAD"), transport->remote);
>
> Some overly long lines here...

Not unique in this document:

117:                                         warning(_("rejected %s
because shallow roots are not allowed to be updated"),
115:                                         warning(_("multiple
branches detected, incompatible with --set-upstream"));
112:         OPT_STRING_LIST('o', "server-option", &server_options,
N_("server-specific"), N_("option to transmit")),
111:                         need_update_head = update_head_config &&
update_head_config != FETCH_UPDATE_HEAD_NEVER;
108:                                   "you need to specify exactly
one branch with the --set-upstream option"));
106:                         die(_("options '%s' and '%s' cannot be
used together"), "--depth", "--unshallow");
106:                 update_head(update_head_config,
find_ref_by_name(remote_refs, "HEAD"), transport->remote);
103:                                 die(_("fetching a group and
specifying refspecs does not make sense"));
103:                         die(_("options '%s' and '%s' cannot be
used together"), "--deepen", "--depth");
103:                                 warning(_("not setting upstream
for a remote remote-tracking branch"));

But I seem to recall previous discussions (perhaps in LKML) where
people accepted that lines 120-characters long are OK. We don't live
in the 80's anymore, terminals have more than 80 columns.

> > +
> >       if (set_upstream) {
> >               struct branch *branch = branch_get("HEAD");
> >               struct ref *rm;
> > diff --git a/remote.c b/remote.c
> > index 641b083d90..5f3a9aa53e 100644
> > --- a/remote.c
> > +++ b/remote.c
> > @@ -344,6 +344,25 @@ static void read_branches_file(struct remote_state *remote_state,
> >       remote->fetch_tags = 1; /* always auto-follow */
> >  }
> >
> > +int parse_update_head(int *r, const char *var, const char *value)
> > +{
> > +     if (!r)
> > +             return -1;
> > +     else if (!value)
> > +             return config_error_nonbool(var);
> > +     else if (!strcmp(value, "never"))
> > +             *r = FETCH_UPDATE_HEAD_NEVER;
> > +     else if (!strcmp(value, "missing"))
> > +             *r = FETCH_UPDATE_HEAD_MISSING;
> > +     else if (!strcmp(value, "always"))
> > +             *r = FETCH_UPDATE_HEAD_ALWAYS;
>
> Ditto, this could really benefit from an enum type, instead of the bare
> "int".

What would change other than `int *r` -> `enum name *r`?

> > +     else {
> > +             error(_("malformed value for %s: %s"), var, value);
> > +             return error(_("must be one of never, missing, or always"));
>
> Shouldn't we use git_die_config() or similar here, to get the line
> number etc., or do we get that somehow (I can't recall).

There's plenty of `error()` in config.c, including
`git_default_push_config`, which this was based on.

> > +     }
> > +     return 0;
> > +}
> > +
> >  static int handle_config(const char *key, const char *value, void *cb)
> >  {
> >       const char *name;
> > @@ -473,6 +492,8 @@ static int handle_config(const char *key, const char *value, void *cb)
> >                                        key, value);
> >       } else if (!strcmp(subkey, "vcs")) {
> >               return git_config_string(&remote->foreign_vcs, key, value);
> > +     } else if (!strcmp(subkey, "updatehead")) {
> > +             return parse_update_head(&remote->update_head, key, value);
> >       }
> >       return 0;
> >  }
> > diff --git a/remote.h b/remote.h
> > index 73638cefeb..9dce42d65d 100644
> > --- a/remote.h
> > +++ b/remote.h
> > @@ -22,6 +22,13 @@ enum {
> >       REMOTE_BRANCHES
> >  };
> >
> > +enum {
> > +     FETCH_UPDATE_HEAD_DEFAULT = 0,
>
> We tend to only init these to 0 when the default being 0 matters,
> i.e. we use it as a boolean, but is that the case here?

In the current version of the code it doesn't matter, but the default
could be different later on.

For example if the default is not specified `!fetch_update_head` the
code could do some guessing, like doing "always" if the remote is a
mirror.

I learned this lesson reorganizing the options of `builtin/pull.c` in
a patch that was never merged. [1]

> > +
> >  #endif
> > diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh
> > index dc44da9c79..dbeb2928ae 100755
> > --- a/t/t5510-fetch.sh
> > +++ b/t/t5510-fetch.sh
> > @@ -814,6 +814,37 @@ test_expect_success 'fetch from multiple configured URLs in single remote' '
> >       git fetch multipleurls
> >  '
> >
> > +test_cmp_symbolic_ref () {
> > +     git symbolic-ref "$1" >actual &&
> > +     echo "$2" >expected &&
> > +     test_cmp expected actual
> > +}
>
> Sort of an aside, but this seems to be the Nth use of this pattern in
> the test suite, e.g. t1401-symbolic-ref.sh repeatedly hardcodes the
> same.
>
> I wonder if a prep commit to stick this in test-lib-functions.sh would
> be in order, or maybe a "--symbolic" argument to "test_cmp_rev"?

Sure. If I had incline that such a patch would be merged (or this one)
I would do it, but I have a plethora of cleanup patches just gathering
dust, so I'd rather not.

Cheers.

[1] https://lore.kernel.org/git/20210705123209.1808663-24-felipe.contreras@xxxxxxxxx/

-- 
Felipe Contreras




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux