Re: remove_duplicates() in builtin/fetch-pack.c is O(N^2)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tuesday, May 22, 2012 12:21:57 pm Jeff King wrote:
> On Mon, May 21, 2012 at 11:51:16PM -0600, Martin Fick 
wrote:
> AFAIK, ref advertisement scales linearly. Which is
> probably not acceptable over a slow link, and we could
> do better.

Right, although it is not just a slow link issue.  Linear or 
not, this problem does not scale, it needs to be based on N, 
the number of refs which need updating, not N the number of 
refs in the repo.

I gave the following numbers to Junio and Shawn recently and 
I figure it is probably worth mentioning them here to 
perhaps give others some insights into just how bad this 
problem can be...

Android consists of approximately 400 projects, and if 
anyone tags their builds regularly, that means that they 
will be tagging 400 projects per build.  We have something 
like 10K tags on average across 400 projects, so when we do 
a simple No Op sync just to see if all 400 projects are up 
to date, this yields about 200MB of data over the wire (4M 
refs)!!!

That 200MB is using a -j4 on repo sync and it chews up about 
1.5 cores on a high end server to serve that 200MB for over 
1min.  Now imagine a build bot needing to build about 25 
images in parallel all just doing a No Op sync to see if 
they are up to date!  That's 25 * 200MB = 5GB data and 25 * 
1.5 cores ~= 36 cores. That means our high end 24 core 
server falls over all for a No Op.

As you can imagine, we really would like to see this 
eliminated from our workflows.  If we want to check 400 
projects to see if they are up to date, it should be 400 
refs, not 400 * 10K,

-Martin

-- 
Employee of Qualcomm Innovation Center, Inc. which is a 
member of Code Aurora Forum
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]