"Shubham Kanodia via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > From: Shubham Kanodia <shubham.kanodia10@xxxxxxxxx> > > This commit introduces a new configuration option, ... The usual way to compose a log message of this project is to - Give an observation on how the current system work in the present tense (so no need to say "Currently X is Y", just "X is Y"), and discuss what you perceive as a problem in it. - Propose a solution (optional---often, problem description trivially leads to an obvious solution in reader's minds). - Give commands to the codebase to "become like so". in this order. To those who have been intimately following the discussion, it often is understandable without some of the above, but we are not writing for those who review the patches. We are primarily writing for future readers of "git log" who are not aware of the review discussion we have on list, so we should give something to prepare them by setting the stage and stating the objective first, before going into how the patch solved it. > diff --git a/Documentation/config/remote.txt b/Documentation/config/remote.txt > index 8efc53e836d..b04ee0c4c22 100644 > --- a/Documentation/config/remote.txt > +++ b/Documentation/config/remote.txt > @@ -33,6 +33,11 @@ remote.<name>.fetch:: > The default set of "refspec" for linkgit:git-fetch[1]. See > linkgit:git-fetch[1]. > > +remote.<name>.prefetchref:: > + Specify the refs to be prefetched when fetching from this remote. > + The value is a space-separated list of ref patterns (e.g., "refs/heads/master !refs/heads/develop*"). > + This can be used to optimize fetch operations by specifying exactly which refs should be prefetched. Overly long lines. > diff --git a/builtin/fetch.c b/builtin/fetch.c > index b2b5aee5bf2..16c8a31c2e1 100644 > --- a/builtin/fetch.c > +++ b/builtin/fetch.c > @@ -38,6 +38,7 @@ > #include "trace.h" > #include "trace2.h" > #include "bundle-uri.h" > +#include "wildmatch.h" The match done by the wildmatch function is a way too loose to be used in the context of parsing and matching the src half of an refspec_item (e.g. a globbing refspec element can have at most one asterisk in it, but wildmatch does not have any sanity check to ensure it). The refspec logic to accept a fetch refspec "refs/heads/*:refs/remotes/origin/*" and rewrite a ref the remote side advertised as refs/heads/foo into refs/remotes/origin/foo, the remote-tracking ref on the local side, is in remote.c:match_name_with_pattern(), and it does not use wildmatch. > +static int matches_prefetch_refs(const char *refname, const struct string_list *prefetch_refs) > +{ > + int i; > + int has_positive = 0; > + int matched_positive = 0; > + int matched_negative = 0; > + > + for (i = 0; i < prefetch_refs->nr; i++) { Just FYI, in modern Git codebase, loop control variable 'i' can be declared in the initialization part of the for() statement, i.e. for (int i = 0; i < ...) { which might make it easier to see that there are three variables of importance to determine the outcome of the computation (and 'i' is merely a loop counter). > + const char *pattern = prefetch_refs->items[i].string; > + int is_negative = (*pattern == '!'); > + > + if (is_negative) > + pattern++; > + else > + has_positive = 1; > + > + if (wildmatch(pattern, refname, 0) == 0) { > + if (is_negative) > + matched_negative = 1; > + else > + matched_positive = 1; > + } > + } I suspect that the function can be exposed to external callers as-is, possibly after renaming it to make it clear that the helper function is about refspec. Then you can call it from the code below in stead of wildmatch(), by passing the pattern as "key", and the refname as "name", with NULL for both "value" and "result", I think. > + > + if (!has_positive) > + return !matched_negative; > + > + return matched_positive && !matched_negative; > +} And the implementation of the logic looks correct. If the pattern says "!A !B !C" (I don't want any of these), we only care that none of these negative patterns trigger, but if the pattern has even one positive one "X Y !A !B !C" (it must match X or Y, but it must not match any of the A B C), we also make sure it matched at least one positive pattern. > +static void ref_remove(struct ref **head, struct ref *to_remove) > +{ > + struct ref **pp, *p; > + > + for (pp = head; (p = *pp) != NULL; pp = &p->next) { > + if (p == to_remove) { > + *pp = p->next; > + return; > + } > + } > +} We remote from a linked list of "struct ref" one such element to_remove. OK. > @@ -610,6 +655,22 @@ static struct ref *get_ref_map(struct remote *remote, > else > ref_map = apply_negative_refspecs(ref_map, &remote->fetch); > > + /** > + * Filter out advertised refs that we don't want to fetch during > + * prefetch if a prefetchref config is set > + */ > + if (prefetch && remote->prefetch_refs.nr) { > + struct ref *ref, *next; > + for (ref = ref_map; ref; ref = next) { > + next = ref->next; > + > + if (!matches_prefetch_refs(ref->name, &remote->prefetch_refs)) { > + ref_remove(&ref_map, ref); > + free_one_ref(ref); It is curious why such an overly deep indentation is used only in this if() statement. It is good that you made sure we do not leak the ref that we removed, but it looks verly inefficient to repeatedly call ref_remove (the function becomes more inefficient as the loop progresses as it is more costly to remove later elements on a linearly linked list). Wouldn't it be more efficient to iterate over the original list, sift each element into "surviving" and "discarded" bin as we go? Something along the lines of ... struct ref *new_list = NULL, **tail = &new_list; while (ref_map) { struct ref *ref = ref_map; ref_map = ref_map->next; ref->next = NULL; if (matches(...)) { *tail = ref; tail = &ref->next; } else { free_one_ref(ref); } } ref_map = new_list; ..., which I imitated code structure of ref_remove_duplicates(). > ref_map = ref_remove_duplicates(ref_map); > diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh > index abae7a97546..054f1f06f95 100755 > --- a/t/t7900-maintenance.sh > +++ b/t/t7900-maintenance.sh Avoid overly long lines in the test script, too. In any case, I very much agree with you that get_ref_map().is the best place to add this logic to. Looking much better. Thanks.