Sounds good. Submitted a patch at: https://lore.kernel.org/git/pull.1782.v4.git.1728073292874.gitgitgadget@xxxxxxxxx/ Which seems to be detached from this thread now — perhaps because I re-wrote the commit message. Or perhaps I'm reading the threads wrong. Apologies for the break. On Tue, Sep 24, 2024 at 2:54 AM Junio C Hamano <gitster@xxxxxxxxx> wrote: > > "Shubham Kanodia via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > > > From: Shubham Kanodia <shubham.kanodia10@xxxxxxxxx> > > > > This commit introduces a new configuration option, ... > > The usual way to compose a log message of this project is to > > - Give an observation on how the current system work in the present > tense (so no need to say "Currently X is Y", just "X is Y"), and > discuss what you perceive as a problem in it. > > - Propose a solution (optional---often, problem description > trivially leads to an obvious solution in reader's minds). > > - Give commands to the codebase to "become like so". > > in this order. > > To those who have been intimately following the discussion, it often > is understandable without some of the above, but we are not writing > for those who review the patches. We are primarily writing for future > readers of "git log" who are not aware of the review discussion we > have on list, so we should give something to prepare them by setting > the stage and stating the objective first, before going into how the > patch solved it. > > > diff --git a/Documentation/config/remote.txt b/Documentation/config/remote.txt > > index 8efc53e836d..b04ee0c4c22 100644 > > --- a/Documentation/config/remote.txt > > +++ b/Documentation/config/remote.txt > > @@ -33,6 +33,11 @@ remote.<name>.fetch:: > > The default set of "refspec" for linkgit:git-fetch[1]. See > > linkgit:git-fetch[1]. > > > > +remote.<name>.prefetchref:: > > + Specify the refs to be prefetched when fetching from this remote. > > + The value is a space-separated list of ref patterns (e.g., "refs/heads/master !refs/heads/develop*"). > > + This can be used to optimize fetch operations by specifying exactly which refs should be prefetched. > > Overly long lines. > > > diff --git a/builtin/fetch.c b/builtin/fetch.c > > index b2b5aee5bf2..16c8a31c2e1 100644 > > --- a/builtin/fetch.c > > +++ b/builtin/fetch.c > > @@ -38,6 +38,7 @@ > > #include "trace.h" > > #include "trace2.h" > > #include "bundle-uri.h" > > +#include "wildmatch.h" > > The match done by the wildmatch function is a way too loose to be > used in the context of parsing and matching the src half of an > refspec_item (e.g. a globbing refspec element can have at most one > asterisk in it, but wildmatch does not have any sanity check to > ensure it). The refspec logic to accept a fetch refspec > "refs/heads/*:refs/remotes/origin/*" and rewrite a ref the remote > side advertised as refs/heads/foo into refs/remotes/origin/foo, the > remote-tracking ref on the local side, is in > remote.c:match_name_with_pattern(), and it does not use wildmatch. > > > +static int matches_prefetch_refs(const char *refname, const struct string_list *prefetch_refs) > > +{ > > + int i; > > + int has_positive = 0; > > + int matched_positive = 0; > > + int matched_negative = 0; > > + > > + for (i = 0; i < prefetch_refs->nr; i++) { > > Just FYI, in modern Git codebase, loop control variable 'i' can be > declared in the initialization part of the for() statement, i.e. > > for (int i = 0; i < ...) { > > which might make it easier to see that there are three variables of > importance to determine the outcome of the computation (and 'i' is > merely a loop counter). > > > + const char *pattern = prefetch_refs->items[i].string; > > + int is_negative = (*pattern == '!'); > > + > > + if (is_negative) > > + pattern++; > > + else > > + has_positive = 1; > > + > > + if (wildmatch(pattern, refname, 0) == 0) { > > + if (is_negative) > > + matched_negative = 1; > > + else > > + matched_positive = 1; > > + } > > + } > > I suspect that the function can be exposed to external callers as-is, > possibly after renaming it to make it clear that the helper function > is about refspec. Then you can call it from the code below in stead > of wildmatch(), by passing the pattern as "key", and the refname as > "name", with NULL for both "value" and "result", I think. > > > + > > + if (!has_positive) > > + return !matched_negative; > > + > > + return matched_positive && !matched_negative; > > +} > > And the implementation of the logic looks correct. If the pattern > says "!A !B !C" (I don't want any of these), we only care that none > of these negative patterns trigger, but if the pattern has even one > positive one "X Y !A !B !C" (it must match X or Y, but it must not > match any of the A B C), we also make sure it matched at least one > positive pattern. > > > +static void ref_remove(struct ref **head, struct ref *to_remove) > > +{ > > + struct ref **pp, *p; > > + > > + for (pp = head; (p = *pp) != NULL; pp = &p->next) { > > + if (p == to_remove) { > > + *pp = p->next; > > + return; > > + } > > + } > > +} > > We remote from a linked list of "struct ref" one such element > to_remove. OK. > > > @@ -610,6 +655,22 @@ static struct ref *get_ref_map(struct remote *remote, > > else > > ref_map = apply_negative_refspecs(ref_map, &remote->fetch); > > > > + /** > > + * Filter out advertised refs that we don't want to fetch during > > + * prefetch if a prefetchref config is set > > + */ > > + if (prefetch && remote->prefetch_refs.nr) { > > + struct ref *ref, *next; > > + for (ref = ref_map; ref; ref = next) { > > + next = ref->next; > > + > > + if (!matches_prefetch_refs(ref->name, &remote->prefetch_refs)) { > > + ref_remove(&ref_map, ref); > > + free_one_ref(ref); > > It is curious why such an overly deep indentation is used only in > this if() statement. > > It is good that you made sure we do not leak the ref that we > removed, but it looks verly inefficient to repeatedly call > ref_remove (the function becomes more inefficient as the loop > progresses as it is more costly to remove later elements on a > linearly linked list). > > Wouldn't it be more efficient to iterate over the original list, sift > each element into "surviving" and "discarded" bin as we go? Something > along the lines of ... > > struct ref *new_list = NULL, **tail = &new_list; > > while (ref_map) { > struct ref *ref = ref_map; > > ref_map = ref_map->next; > ref->next = NULL; > > if (matches(...)) { > *tail = ref; > tail = &ref->next; > } else { > free_one_ref(ref); > } > } > ref_map = new_list; > > ..., which I imitated code structure of ref_remove_duplicates(). > > > ref_map = ref_remove_duplicates(ref_map); > > > diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh > > index abae7a97546..054f1f06f95 100755 > > --- a/t/t7900-maintenance.sh > > +++ b/t/t7900-maintenance.sh > > Avoid overly long lines in the test script, too. > > In any case, I very much agree with you that get_ref_map().is the > best place to add this logic to. Looking much better. > > Thanks.