Re: Something is broken in repack

Switching to the Google perftools malloc (tcmalloc):
http://goog-perftools.sourceforge.net/
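
(For anyone wanting to reproduce this: swapping allocators should not
need a git rebuild; linking with -ltcmalloc or running the repack under
LD_PRELOAD=/path/to/libtcmalloc.so both work, with the exact library
path depending on where perftools was installed.)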

complete seconds RAM
10%        30   828M
20%        15   831M
30%        10   834M
40%        50   1014M
50%        80   1086M
60%        80   1500M
70%       200   1.53G
80%       200   1.85G
90%       260   1.87G
95%       520   1.97G
100%     1335   2.24G

The Google allocator knocked 600MB off the memory use. Memory
consumption did not fall during the write-out phase the way it did
with the gcc allocator.

Since all of this is the same code except for the change in the
threading split, those runs where memory consumption went to 4.5GB
with the gcc allocator must have triggered an extreme fragmentation
problem.
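
To see how plain malloc can strand memory like that, here is a toy
probe I would try. It is only a sketch of the suspected pattern
(mixed-size allocations where the survivors pin the heap), not git's
actual allocation behavior, and malloc_stats() is glibc-specific:

    /* fragprobe.c: allocate small and large objects interleaved,
     * free the large ones, and see how much of the heap the
     * allocator can actually give back.  Build: gcc -O2 fragprobe.c */
    #include <stdio.h>
    #include <stdlib.h>
    #include <malloc.h>

    #define N 50000

    int main(void)
    {
        static void *small[N], *big[N];
        int i;

        for (i = 0; i < N; i++) {
            small[i] = malloc(64);      /* long-lived small object */
            big[i]   = malloc(4096);    /* short-lived large one beside it */
        }
        for (i = 0; i < N; i++)
            free(big[i]);               /* ~200MB "freed"... */

        /* ...but every heap page still holds a live 64-byte chunk,
         * so the freed space becomes fragmentation in the free lists
         * rather than memory returned to the kernel. */
        malloc_stats();                 /* glibc: system vs in-use bytes */
        return 0;
    }

A big gap between "system bytes" and "in use bytes" here would make
the 4.5GB runs look a lot more plausible.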

Total CPU time was 196 CPU minutes vs. 190 for gcc, so Google's claim
of being faster does not hold up in this test.

So why does our threaded code take 20 CPU minutes longer (12%) to run
than the same code with a single thread? Clock time is obviously
faster. Are the threads working too close to each other in memory and
bouncing cache lines between the cores? The Q6600 is just two E6600s
in the same package, so the caches are not shared between the two dies.
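
One way to test the cache-line theory without git in the picture is
the classic false-sharing probe below (my sketch, not anything from
git): two threads hammer adjacent counters that land in the same cache
line. If bouncing a line between the dies is what hurts, the run
should get dramatically faster once the pad is uncommented.

    /* bounce.c: two threads increment counters that share one cache
     * line.  Uncomment the pad to give each counter its own line and
     * compare "time ./bounce" for the two builds.
     * Build: gcc -O2 -pthread bounce.c -o bounce */
    #include <pthread.h>
    #include <stdio.h>

    #define ITERS 200000000UL

    static struct {
        volatile unsigned long a;
        /* char pad[64]; */             /* <-- uncomment to split the line */
        volatile unsigned long b;
    } counters;

    static void *bump_a(void *arg)
    {
        unsigned long i;
        for (i = 0; i < ITERS; i++)
            counters.a++;
        return arg;
    }

    static void *bump_b(void *arg)
    {
        unsigned long i;
        for (i = 0; i < ITERS; i++)
            counters.b++;
        return arg;
    }

    int main(void)
    {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, bump_a, NULL);
        pthread_create(&t2, NULL, bump_b, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("%lu %lu\n", counters.a, counters.b);
        return 0;
    }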

Why does the threaded code need 2.24GB (Google allocator; 2.85GB with
gcc) with four threads, but only 950MB with one thread? Where is the
extra gigabyte going?
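
One theory for the extra gigabyte (an assumption on my part, but it
matches how glibc's threaded malloc is documented to behave): each
thread gets its own arena, each arena keeps its own free pool, so four
threads can each sit on their own high-water mark. malloc_stats()
prints one system/in-use pair per arena, so a probe like this sketch
would at least show how many arenas the threads end up with:

    /* arenas.c: four threads each churn through ~64MB of allocations,
     * then main prints per-arena statistics.
     * Build: gcc -O2 -pthread arenas.c -o arenas  (glibc-specific) */
    #include <pthread.h>
    #include <stdlib.h>
    #include <malloc.h>

    #define NTHREADS 4
    #define NBLOCKS  1000

    static void *churn(void *arg)
    {
        void *p[NBLOCKS];
        int i;
        for (i = 0; i < NBLOCKS; i++)
            p[i] = malloc(64 * 1024);   /* below the mmap threshold, so
                                           served from this thread's arena */
        for (i = 0; i < NBLOCKS; i++)
            free(p[i]);                 /* freed into the arena, not
                                           necessarily back to the kernel */
        return arg;
    }

    int main(void)
    {
        pthread_t t[NTHREADS];
        int i;
        for (i = 0; i < NTHREADS; i++)
            pthread_create(&t[i], NULL, churn, NULL);
        for (i = 0; i < NTHREADS; i++)
            pthread_join(t[i], NULL);
        malloc_stats();                 /* one "system/in use" pair per arena */
        return 0;
    }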

Is there another allocator to try, one that combines Google's memory
efficiency with gcc's speed?


On 12/11/07, Jon Smirl <jonsmirl@xxxxxxxxx> wrote:
> I added the gcc people to the CC since it's their repository. Maybe
> they can help us sort this out.
>
> On 12/11/07, Jon Smirl <jonsmirl@xxxxxxxxx> wrote:
> > On 12/10/07, Nicolas Pitre <nico@xxxxxxx> wrote:
> > > On Mon, 10 Dec 2007, Jon Smirl wrote:
> > >
> > > > New run using same configuration. With the addition of the more
> > > > efficient load balancing patches and delta cache accounting.
> > > >
> > > > Seconds are wall clock time. They are lower since the patch made
> > > > threading better at using all four cores. I am stuck at 380-390% CPU
> > > > utilization for the git process.
> > > >
> > > > complete seconds RAM
> > > > 10%        60   900M (includes counting)
> > > > 20%        15   900M
> > > > 30%        15   900M
> > > > 40%        50   1.2G
> > > > 50%        80   1.3G
> > > > 60%        70   1.7G
> > > > 70%       140   1.8G
> > > > 80%       180   2.0G
> > > > 90%       280   2.2G
> > > > 95%       530   2.8G - 1,420 total to here, previous was 1,983
> > > > 100%     1390   2.85G
> > > > During the writing phase RAM fell to 1.6G
> > > > What is being freed in the writing phase??
> > >
> > > The cached delta results, but you put a cap of 256MB for them.
> > >
> > > Could you try again with that cache disabled entirely, with
> > > pack.deltacachesize = 1 (don't use 0 as that means unbounded).
> > >
> > > And then, while still keeping the delta cache disabled, could you try
> > > with pack.threads = 2, and pack.threads = 1 ?
> > >
> > > I'm sorry to ask you to do this but I don't have enough ram to even
> > > complete a repack with threads=2 so I'm reattempting single threaded at
> > > the moment.  But I really wonder if the threading has such an effect on
> > > memory usage.
> >
> > I already have a threads = 1 running with this config. Binary and
> > config were same from threads=4 run.
> >
> > 10% 28min 950M
> > 40% 135min 950M
> > 50% 157min 900M
> > 60% 160min 830M
> > 100% 170min 830M
> >
> > Something is hurting bad with threads. 170 CPU minutes with one
> > thread, versus 195 CPU minutes with four threads.
> >
> > Is there a different memory allocator that can be used when
> > multithreaded on gcc? This whole problem may be coming from the memory
> > allocation function. git is hardly interacting at all on the thread
> > level so it's likely a problem in the C run-time.
> >
> > [core]
> >         repositoryformatversion = 0
> >         filemode = true
> >         bare = false
> >         logallrefupdates = true
> > [pack]
> >         threads = 1
> >         deltacachesize = 256M
> >         windowmemory = 256M
> >         deltacachelimit = 0
> > [remote "origin"]
> >         url = git://git.infradead.org/gcc.git
> >         fetch = +refs/heads/*:refs/remotes/origin/*
> > [branch "trunk"]
> >         remote = origin
> >         merge = refs/heads/trunk
> >
> > > >
> > > > I have no explanation for the change in RAM usage. Two guesses come to
> > > > mind. Memory fragmentation. Or the change in the way the work was
> > > > split up altered RAM usage.
> > > >
> > > > Total CPU time was 195 minutes in 70 minutes clock time. About 70%
> > > > efficient. During the compress phase all four cores were active until
> > > > the last 90 seconds. Writing the objects took over 23 minutes CPU
> > > > bound on one core.
> > > >
> > > > New pack file is: 270,594,853
> > > > Old one was: 344,543,752
> > > > It still has 828,660 objects
> > >
> > > You mean the pack for the gcc repo is now less than 300MB?  Wow.
> > >
> > >
> > > Nicolas
> > >
> >
> >
> > --
> > Jon Smirl
> > jonsmirl@xxxxxxxxx
> >
>
>
> --
> Jon Smirl
> jonsmirl@xxxxxxxxx
>


-- 
Jon Smirl
jonsmirl@xxxxxxxxx
