On Fri, 2006-06-16 at 13:44 -0400, Jon Smirl wrote:
> I've been extracting versions from cvs and adding them to git now for
> 2.5 days and the process still isn't finished. It is completely CPU
> bound. It's just a loop of cvs co, add it to git, make tree, commit,
> etc.

To do all of mozilla using parsecvs (even with the quadratic algorithm)
takes about three hours on annarchy.freedesktop.org (two dual-core
Opterons with 4GB of memory), including all conversion to packs. The
pack time is a tiny fraction of that.

> What about the cvs2svn algorithm described in the attachment? A ram
> based version could be faster. Compression could be achieved by
> switching from using the full path to a version to the sha1 for it.

Yes, parsecvs currently keeps everything in memory when doing the tree
conversion, which means it grows to a huge size while computing the
full tree of revisions. Computing git tree objects from the top down,
then computing commit objects from the bottom up, should allow us to
free most of that memory during the branch history computation. I'm
starting a rewrite of parsecvs to try this approach and see how well
it works.

If you've looked at the parsecvs source code, you'll notice it's a
mess at present; I started by attempting to do pair-wise tree merges
in a mistaken attempt to convert a linear term into a logarithmic one.
Hacking that code into its present form should be viewed more as a
demonstration of how the overall process can work than as an optimal
expression of the algorithm.

--
keith.packard@xxxxxxxxx
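[Editor's note: the top-down tree / bottom-up commit scheme described in the message can be sketched roughly as below. This is a minimal illustration of the memory-freeing idea, not parsecvs code; the snapshot representation, `tree_hash`, and `convert` are all hypothetical stand-ins.]

```python
# Illustrative sketch only: compute every tree hash first (top down),
# then emit commits oldest-first (bottom up along history), freeing
# each revision's big file table as soon as its commit is written.
import hashlib

def tree_hash(files):
    """Hash a {path: blob_sha} mapping, like a flattened git tree."""
    data = "".join(f"{p}\0{s}\n" for p, s in sorted(files.items()))
    return hashlib.sha1(data.encode()).hexdigest()

def convert(revisions):
    """revisions: oldest-first list of {path: blob_sha} snapshots.

    Pass 1: compute all tree hashes while snapshots are still live.
    Pass 2: chain commits from the oldest end; drop each snapshot
    once its commit exists, so peak memory falls as the pass runs.
    """
    trees = [tree_hash(files) for files in revisions]      # pass 1
    commits, parent = [], None
    for i, tree in enumerate(trees):                       # pass 2
        commit = hashlib.sha1(f"{tree}\0{parent}".encode()).hexdigest()
        commits.append(commit)
        revisions[i] = None    # free the per-revision file table
        parent = commit
    return commits
```

The point of the two passes is that the expensive per-revision state is only needed until its tree hash and commit are computed, so it need not stay resident for the whole conversion.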