Re: irc usage..

"Martin Langhoff" <martin.langhoff@xxxxxxxxx> · Mon, 5 Jun 2006 14:06:59 +1200

On 6/5/06, Alec Warner <antarus@xxxxxxxxxx> wrote:
> Ok the box this was running on had issues, so I switched to using
> pearl.amd64.dev.gentoo.org, a dual core amd64 X2 4600+ with 4 gigs of
> ram and plenty of disk.  The "problem" now is just conversion time...30
> hours and I'm into 2004-09-17...but it's been in 2004 all day, seems
> like most of the commits are in the last three years.  Are there
> architectural issues with doing this in parallel?

I don't think you can do this in parallel. What I would do is remove
the -a from the git-repack invocation. The -a hurts import times quite
a bit -- just do a git-repack -a -d when it's done.
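
A rough sketch of that repack schedule, in case it helps. It assumes
the importer currently repacks with -a -d every so often; the names
run_import and import_one and the 1024-commit interval are made up
for illustration, and "git repack" is the current spelling of
git-repack:

```python
import subprocess


def repack_argv(final):
    """Return the repack command line for an incremental or final repack."""
    if final:
        # One full repack at the very end: everything into a single pack.
        return ["git", "repack", "-a", "-d"]
    # During the import: no -a, so only loose objects get packed.
    return ["git", "repack", "-d"]


def run_import(commits, import_one, repack_every=1024, run=subprocess.check_call):
    """Import commits in order, repacking cheaply along the way.

    import_one is a hypothetical callback that imports a single commit;
    repack_every is an assumed interval, not a value from the importer.
    """
    for n, commit in enumerate(commits, start=1):
        import_one(commit)
        if n % repack_every == 0:
            run(repack_argv(final=False))
    run(repack_argv(final=True))
```

Passing a different run callable (a list's append, say) lets you see
which commands would be issued without touching a repository.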

And... having said that, there is still a memory leak somehow,
somewhere. It's been evading me for 2 weeks, so I feel like an idiot
now. Not too bad in general, but it shows clearly in the gentoo and
mozilla imports.

> Since the repository commits are all in cvs, it should be possible to do
> the work in parallel, since you know what all the commits touch.  The
> concern would be ordering of nodes in the tree; you'd end up building a
> bunch of subtrees and patching them together?

Well... parsecvs does a bit of this but in sequential fashion... it
imports all the files first, and then runs through the history
building the tree+commits in order, committing them. It saves a lot of
time in the file imports by parsing the RCS file directly. The
downside is that it must keep a filename+version=>sha1 mapping --
which I think is why parsecvs won't fit in memory until it's changed
to store it on disk somehow ;-)
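
To illustrate what "store it on disk" could look like: a minimal
sketch of a filename+version=>sha1 map kept in SQLite instead of in
memory. This is not how parsecvs does it; the RevisionMap class and
the table layout are invented for the example.

```python
import sqlite3


class RevisionMap:
    """Hypothetical (filename, revision) -> sha1 map backed by SQLite."""

    def __init__(self, path=":memory:"):
        # Pass a real file path to keep the mapping on disk.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS blobs ("
            " filename TEXT NOT NULL,"
            " revision TEXT NOT NULL,"
            " sha1 TEXT NOT NULL,"
            " PRIMARY KEY (filename, revision))"
        )

    def put(self, filename, revision, sha1):
        """Record the blob sha1 for one revision of one file."""
        self.db.execute(
            "INSERT OR REPLACE INTO blobs VALUES (?, ?, ?)",
            (filename, revision, sha1),
        )

    def get(self, filename, revision):
        """Return the sha1 for a file revision, or None if unknown."""
        row = self.db.execute(
            "SELECT sha1 FROM blobs WHERE filename = ? AND revision = ?",
            (filename, revision),
        ).fetchone()
        return row[0] if row else None

    def commit(self):
        """Flush pending writes to the database file."""
        self.db.commit()
```

The file-import pass would call put() once per RCS revision, and the
later tree-building pass would call get() as it walks the history.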

You are forced to do it in a sequence because cvsps only tells you
about the files added/removed/changed in a commit -- you need the
ancestor to have a view of what the whole tree looked like. The only
room for parallelism I see is to fork off new processes to work on
branches in parallel.
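
A toy sketch of why the ordering matters. The changeset dicts here
are a made-up shape for the example, not cvsps's actual output
format: each one lists only what changed, so the full tree for a
commit can only be computed from its parent's tree.

```python
def apply_changeset(parent_tree, changeset):
    """Build a commit's full tree from its parent's tree plus a delta.

    parent_tree maps path -> blob sha1. changeset is a hypothetical dict
    with "changed" (path -> sha1, covering added and modified files) and
    "removed" (list of paths).
    """
    tree = dict(parent_tree)  # copy so the parent's snapshot stays intact
    for path, sha1 in changeset.get("changed", {}).items():
        tree[path] = sha1
    for path in changeset.get("removed", []):
        tree.pop(path, None)
    return tree


def replay(changesets):
    """Replay changesets in order, returning the full tree after each one."""
    tree = {}
    snapshots = []
    for changeset in changesets:
        # Each step needs the previous result, hence the sequential loop.
        tree = apply_changeset(tree, changeset)
        snapshots.append(tree)
    return snapshots
```

Each branch has its own chain of ancestors, so a separate replay per
branch is the one place this loop could be split across processes.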

martin