Re: [PATCH] use a hashmap to make remotes faster

On Tue, Jul 29, 2014 at 09:57:45AM +0200, Matthieu Moy wrote:

> "patrick.reynolds@xxxxxxxxxx" <patrick.reynolds@xxxxxxxxxx> writes:
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> It seems you mixed your name and email address in your config file. I
> guess your name is "Patrick Reynolds", not
> "patrick.reynolds@xxxxxxxxxx".

Also, Patrick, please sign off your patch ("git format-patch -s").

> > Remotes are stored as an array, so looking one up or adding one without
> > duplication is an O(n) operation.  Reading an entire config file full of
> > remotes is O(n^2) in the number of remotes.  For a repository with tens of
> > thousands of remotes, the running time can hit multiple minutes.
> 
> Just being curious: in which scenario do you have tens of thousands of
> remotes?
> 
> (not an objection, it's a good thing anyway)

Whenever you fork a repository at GitHub, we give you a leaf repository
that points its info/alternates to a master "network.git" repository for
the fork network.  The network.git repo contains all of the objects, and
has a remote configured for each of the child repositories. You would
never want to gc in that repository without doing a "fetch --all" first.

Most networks have only a few dozen forks, but a few have a large number
(torvalds/linux has ~5K, and homebrew is close to 10K).  And then
sometimes a MOOC instructor tells an entire 50K-person class to fork a
hello-world project all at once. :)
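
To make the complexity argument from the patch description concrete, here
is a minimal sketch of a name-keyed lookup table. It is not the code from
the patch and does not use git's own hashmap API; every name in it
(remote_table, remote_entry, remote_table_get, and so on) is hypothetical.
With an array, each "find or add" scans all existing remotes, so reading n
remotes from the config costs O(n^2) string comparisons. With a hash table,
each "find or add" only walks one bucket chain, so the whole read is roughly
linear as long as the chains stay short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Fixed bucket count keeps the sketch short; a real table would grow. */
#define NBUCKETS 1024

/* Hypothetical types for illustration; not git's struct remote. */
struct remote_entry {
	char *name;
	struct remote_entry *next; /* next entry in the same bucket */
};

struct remote_table {
	struct remote_entry *buckets[NBUCKETS];
	size_t count;
};

/* FNV-1a hash over the bytes of the remote name. */
static unsigned int hash_name(const char *s)
{
	unsigned int h = 2166136261u;

	while (*s) {
		h ^= (unsigned char)*s++;
		h *= 16777619u;
	}
	return h;
}

/* Return the entry for name, or NULL; only one bucket chain is walked. */
static struct remote_entry *remote_table_find(struct remote_table *t,
					      const char *name)
{
	struct remote_entry *e;

	assert(name != NULL);
	for (e = t->buckets[hash_name(name) % NBUCKETS]; e; e = e->next)
		if (!strcmp(e->name, name))
			return e;
	return NULL;
}

/* Find the entry for name, adding it if missing; NULL if out of memory. */
static struct remote_entry *remote_table_get(struct remote_table *t,
					     const char *name)
{
	struct remote_entry *e = remote_table_find(t, name);
	size_t len = strlen(name) + 1;
	unsigned int b = hash_name(name) % NBUCKETS;

	if (e)
		return e;
	e = malloc(sizeof(*e));
	if (!e)
		return NULL;
	e->name = malloc(len);
	if (!e->name) {
		free(e);
		return NULL;
	}
	memcpy(e->name, name, len);
	e->next = t->buckets[b]; /* push onto the front of the chain */
	t->buckets[b] = e;
	t->count++;
	return e;
}
```

A zero-initialized struct remote_table is an empty table. Calling
remote_table_get() twice with the same name returns the same entry and
leaves count unchanged, which is the "add without duplication" step that
costs a full scan in the array version.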

-Peff



