Re: Git is not scalable with too many refs/*

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Sep 27, 2011 at 9:26 AM, Julian Phillips
<julian@xxxxxxxxxxxxxxxxx> wrote:
>
> On Mon, 26 Sep 2011 15:52:04 -0600, Martin Fick wrote:
>>
>> On Monday, September 26, 2011 03:39:33 pm Martin Fick wrote:
>>>
>>> On Monday, September 26, 2011 02:28:53 pm Julian Phillips
>>> wrote:
>>> > >> Random thought.  What happens to the with
>>> > >> compression case if you leave the commit in, but
>>> > >> add a sleep(15) to the end of sort_refs_list?
>>> > >
>>> > > Why, what are you thinking?  Hmm, I am trying this on
>>> > > the non gced repo and it doesn't seem to be
>>> > > completing (no cpu usage)!  It appears that perhaps
>>> > > it is being called many times (the sleeping would
>>> > > explain no cpu usage)?!?  This could be a real
>>> > > problem, this should only get called once right?
>>> >
>>> > I was just wondering if the time taken to get the refs
>>> > was changing the interaction with something else.  Not
>>> > very likely, but ...
>>> >
>>> > I added a print statement, and it was called four times
>>> > when I had unpacked refs, and once with packed.  So,
>>> > maybe you are hitting some nasty case with unpacked
>>> > refs.  If you use a print statement instead of a sleep,
>>> > how many times does sort_refs_lists get called in your
>>> > unpacked case?  It may well also be worth calculating
>>> > the time taken to do the sort.
>>>
>>> In my case it was called 18785 times!  Any other tests I
>>> should run?
>>
>> Gerrit stores the changes in directories under refs/changes
>> named after the last 2 digits of the change.  Then under
>> each change it stores each patchset.  So it looks like this:
>> refs/changes/dd/change_num/ps_num
>>
>> I noticed that:
>>
>>  ls refs/changes/* | wc -l
>>  -> 18876
>>
>> somewhat close, but not super close to 18785,  I am not sure
>> if that is a clue.  It's almost like each change is causing
>> a re-sort,
>
> basically, it is ...
>
> Back when I made that change, I failed to notice that get_ref_dir was recursive for subdirectories ... sorry ...
>
> Hopefully this should speed things up.  My test repo went from ~17m user time, to ~2.5s.
> Packing still make things much faster of course.
>
> diff --git a/refs.c b/refs.c
> index a615043..212e7ec 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -319,7 +319,7 @@ static struct ref_list *get_ref_dir(const char *submodule, c
>                free(ref);
>                closedir(dir);
>        }
> -       return sort_ref_list(list);
> +       return list;
>  }
>
>  struct warn_if_dangling_data {
> @@ -361,11 +361,13 @@ static struct ref_list *get_loose_refs(const char *submodu
>        if (submodule) {
>                free_ref_list(submodule_refs.loose);
>                submodule_refs.loose = get_ref_dir(submodule, "refs", NULL);
> +               submodule_refs.loose = sort_refs_list(submodule_refs.loose);
>                return submodule_refs.loose;
>        }
>
>        if (!cached_refs.did_loose) {
>                cached_refs.loose = get_ref_dir(NULL, "refs", NULL);
> +               cached_refs.loose = sort_refs_list(cached_refs.loose);
>                cached_refs.did_loose = 1;
>        }
>        return cached_refs.loose;
>
>
>
>>
>>
>> -Martin
>
> --
> Julian
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Well done! I'll try to compose a patch attributed to Julian with the
information from this thread.

--
David Barr
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]