On Sun, Sep 15, 2019 at 7:35 PM Derrick Stolee <stolee@xxxxxxxxx> wrote:
>
> On 9/15/2019 5:18 PM, Masaya Suzuki wrote:
> > During git-fetch, the client checks if the advertised tags' OIDs are
> > already in the fetch request's want OID set. This check is done in a
> > linear scan. For a repository that has a lot of refs, repeating this
> > scan takes 15+ minutes. In order to speed this up, create a oid_set for
> > other refs' OIDs.
>
> Good catch! Quadratic performance is never good.
>
> The patch below looks like it works, but could you also share your
> performance timings for the 15+ minute case after your patch is
> applied?

With the following code change, I measured the time spent in
find_non_local_tags. It shows 215 msec with the example commands. (I
didn't measure the entire fetch time, as a good portion of the time is
spent on the server side.)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 51a276dfaa..d3b06c733d 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -25,6 +25,7 @@
 #include "list-objects-filter-options.h"
 #include "commit-reach.h"
 #include "branch.h"
+#include <time.h>

 #define FORCED_UPDATES_DELAY_WARNING_IN_MS (10 * 1000)

@@ -322,8 +323,11 @@ static void find_non_local_tags(const struct ref *refs,
 	const struct ref *ref;
 	struct refname_hash_entry *item = NULL;

+	struct timespec start, end;
+
 	refname_hash_init(&existing_refs);
 	refname_hash_init(&remote_refs);
+	clock_gettime(CLOCK_MONOTONIC, &start);
 	create_fetch_oidset(head, &fetch_oids);

 	for_each_ref(add_one_refname, &existing_refs);
@@ -405,6 +409,12 @@ static void find_non_local_tags(const struct ref *refs,
 	}
 	hashmap_free(&remote_refs, 1);
 	string_list_clear(&remote_refs_list, 0);
+	clock_gettime(CLOCK_MONOTONIC, &end);
+	{
+		uint64_t millisec = (end.tv_sec - start.tv_sec) * 1000 + (end.tv_nsec - start.tv_nsec) / 1000000;
+		fprintf(stderr, "find_non_local_tags: %ld msec\n", millisec);
+	}
+
 	oidset_clear(&fetch_oids);
 }
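
For anyone who wants to repeat this kind of measurement outside of git, the
timing part can be pulled out into a small standalone sketch. The helper names
elapsed_ms and time_call are hypothetical and not part of git. The sketch
assumes a POSIX system that provides clock_gettime with CLOCK_MONOTONIC. It
computes the difference in nanoseconds first and only then converts to
milliseconds, and it prints with PRId64 so that the format matches the
integer type on all platforms.

```c
#define _POSIX_C_SOURCE 200809L

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper (not part of git): whole milliseconds elapsed
 * between two timestamps. The difference is taken in nanoseconds first,
 * so a negative tv_nsec difference is handled correctly.
 */
static int64_t elapsed_ms(const struct timespec *start,
			  const struct timespec *end)
{
	int64_t ns = (int64_t)(end->tv_sec - start->tv_sec) * 1000000000
		   + (end->tv_nsec - start->tv_nsec);
	return ns / 1000000;
}

/*
 * Hypothetical helper: run fn(arg) once, report the elapsed time on
 * stderr under the given label, and return it in milliseconds.
 */
static int64_t time_call(const char *label, void (*fn)(void *), void *arg)
{
	struct timespec start, end;
	int64_t ms;

	clock_gettime(CLOCK_MONOTONIC, &start);
	fn(arg);
	clock_gettime(CLOCK_MONOTONIC, &end);

	ms = elapsed_ms(&start, &end);
	fprintf(stderr, "%s: %" PRId64 " msec\n", label, ms);
	return ms;
}
```

A caller would wrap the code under test in a function taking a void pointer
and pass it to time_call together with a label. CLOCK_MONOTONIC is used
rather than CLOCK_REALTIME so that wall-clock adjustments during the run do
not distort the result.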