Re: Mozilla, git and Windows

"Jon Smirl" <jonsmirl@xxxxxxxxx> · Mon, 27 Nov 2006 20:35:05 -0500

On 11/27/06, Petr Baudis <pasky@xxxxxxx> wrote:
> On Mon, Nov 27, 2006 at 05:13:10PM CET, Jon Smirl wrote:
> > The SVN version of the Mozilla repository is about 3GB. It takes
> > around a week of CPU time for svnimport to process it.
>
> Is there a reason why a SVN importer would _have_ to take _longer_ than
> a CVS importer? I'd expect the opposite from an optimized importer since
> you don't have to guess the changesets...

These import programs take forever because they fork off git, SVN or
CVS millions of times. It really does take a week to fork a CVS
process that many times. It's not the application code that is taking
a week to run, it is the millions of forks.

As was mentioned in the thread about doing CVS to git import, the
trick is to write your own CVS file parser, parse the file once (not
once for each revision) and output all of the revisions to the git
database in a single pass. When code is structured that way I can
import the whole Mozilla repository into git in two hours. The
fast-import back end also works without forking: it just listens for
commands on stdin and acts on them, and all of the commands are
implemented in a single binary.
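
To make the single-pass idea concrete, here is a rough sketch of a
front end that writes every revision of one file into a single command
stream. It assumes the stream syntax described in the current git
fast-import documentation (commit / committer / data / M ... inline),
which may differ from the back end discussed in this thread. The
function names and the revision dictionary layout are made up for
illustration, and the CVS file parser itself is left out.

```python
import subprocess


def write_data(out, payload):
    """Write one 'data' block: an exact byte count, then the raw bytes."""
    out.write(b"data %d\n" % len(payload))
    out.write(payload)
    out.write(b"\n")  # optional trailing LF, keeps the stream readable


def emit_revisions(out, path, revisions, ref=b"refs/heads/master"):
    """Emit one commit per revision of a single file, oldest first.

    'revisions' is a hypothetical list of dicts holding bytes values for
    name, email, log and text, plus an integer Unix timestamp 'when'.
    Later commits on the same ref take the previous one as their parent.
    """
    for rev in revisions:
        out.write(b"commit " + ref + b"\n")
        out.write(b"committer %s <%s> %d +0000\n"
                  % (rev["name"], rev["email"], rev["when"]))
        write_data(out, rev["log"])
        # 'inline' means the file content follows as a data block.
        out.write(b"M 100644 inline " + path + b"\n")
        write_data(out, rev["text"])


def import_into_git(repo_dir, path, revisions):
    """Start git fast-import once and feed it the whole stream."""
    proc = subprocess.Popen(["git", "fast-import", "--quiet"],
                            cwd=repo_dir, stdin=subprocess.PIPE)
    emit_revisions(proc.stdin, path, revisions)
    proc.stdin.close()
    return proc.wait()
```

The point of the structure is that import_into_git() starts exactly one
child process no matter how many revisions the file has; everything
else is plain writes to a pipe.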

The speed of fork in Linux is fine for most purposes, but it is not
fine if you are going to fork off good-sized apps several million
times. When I measured those forks in oprofile, 60% of the CPU was
being consumed by the kernel.
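
For anyone who wants a rough feel for the per-process cost on their own
machine, a small timing sketch along these lines may help. It is not
the oprofile measurement mentioned above: it times fork plus exec plus
interpreter startup of a trivial child, and compares that with doing a
trivial piece of work in-process. The function names are invented for
the example.

```python
import subprocess
import sys
import time


def time_spawns(n):
    """Average wall-clock seconds to start and reap one trivial child."""
    start = time.perf_counter()
    for _ in range(n):
        # Each iteration creates a new process, as a forking importer would.
        subprocess.run([sys.executable, "-c", "pass"], check=True)
    return (time.perf_counter() - start) / n


def time_calls(n):
    """Average wall-clock seconds for a trivial in-process operation."""
    start = time.perf_counter()
    for _ in range(n):
        len(b"x")  # stand-in for handling one revision without forking
    return (time.perf_counter() - start) / n


if __name__ == "__main__":
    per_spawn = time_spawns(20)
    per_call = time_calls(100000)
    print("per spawn: %.6f s" % per_spawn)
    print("per call:  %.9f s" % per_call)
    # Projected cost of a million spawns, in hours.
    print("1e6 spawns: %.1f hours" % (per_spawn * 1e6 / 3600))
```

Multiplying the per-spawn figure by a few million gives a quick
estimate of how much of an import's run time is process creation
rather than application code.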

--
Jon Smirl
jonsmirl@xxxxxxxxx