Martin Langhoff wrote:
On 6/5/06, Alec Warner <antarus@xxxxxxxxxx> wrote:
OK, the box this was running on had issues, so I switched to using
pearl.amd64.dev.gentoo.org, a dual-core amd64 X2 4600+ with 4 gigs of
RAM and plenty of disk. The "problem" now is just conversion time: 30
hours in and I'm at 2004-09-17. It's been in 2004 all day; it seems
like most of the commits are in the last three years. Are there
architectural issues with doing this in parallel?
I don't think you can do this in parallel. What I would do is remove
the -a from the git-repack invocation. It does hurt import times quite
a bit -- just do a git-repack -a -d when it's done.
Only repack at the end, then? Disk space isn't an issue here, so I'll give
that a shot.
And... having said that, there is still a memory leak somehow,
somewhere. It's been evading me for 2 weeks now, so I feel an idiot
now. Not too bad in general, but it shows clearly in the gentoo and
mozilla imports.
30565 antarus 17 0 470m 456m 1640 S 14 11.6 234:23.38 git-cvsimport
30566 antarus 16 0 6753m 147m 752 S 7 3.7 120:27.06 cvs
I'm on cvs-1.11.12 and a git built from the current git repository.
You are forced to do it in a sequence because cvsps only tells you
about the files added/removed/changed in a commit -- you need the
ancestor to have a view of what the whole tree looked like. The only
room for parallelism I see is to fork off new processes to work on
branches in parallel.
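
To illustrate the dependency described above, here is a minimal sketch in Python. It is not how cvsps or git-cvsimport actually represent data: PatchSet, apply_patchset and replay are hypothetical names, and file contents are modelled as plain strings. It only shows why a commit's full tree cannot be built from its patchset alone.

```python
from typing import Dict, List, NamedTuple

Tree = Dict[str, str]  # path -> file content (simplified model)


class PatchSet(NamedTuple):
    """Hypothetical stand-in for one patchset: only the files a commit touched."""
    changed: Dict[str, str]  # paths added or modified, with their new content
    removed: List[str]       # paths deleted by this commit


def apply_patchset(parent: Tree, ps: PatchSet) -> Tree:
    """Build a commit's full tree from its parent's tree plus the delta."""
    tree = dict(parent)          # copy, so the parent snapshot stays intact
    tree.update(ps.changed)
    for path in ps.removed:
        tree.pop(path, None)
    return tree


def replay(patchsets: List[PatchSet]) -> List[Tree]:
    """Replay patchsets in order; every snapshot needs the previous one."""
    snapshots: List[Tree] = []
    current: Tree = {}
    for ps in patchsets:
        current = apply_patchset(current, ps)
        snapshots.append(current)
    return snapshots


history = [
    PatchSet({"a.txt": "v1"}, []),
    PatchSet({"b.txt": "v1"}, []),
    PatchSet({"a.txt": "v2"}, []),  # touches a.txt only
]
snapshots = replay(history)

# Applying the last patchset without its ancestor loses b.txt entirely.
without_ancestor = apply_patchset({}, history[2])
```

In this model the third snapshot still contains b.txt only because the second snapshot was computed first. Independent branches could each run their own replay from the branch point, which is the only parallelism suggested above.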
Not helpful in the Gentoo case, since we only have one branch, minus an
accident when a dev branched gentoo-x86 a while back ;)
I'll keep chugging on this one; it won't be the final import as I
haven't used the complete Authors file, so I will try the repacking
optimization next time I do an import.
-Alec Warner