Mike Hommey <mh@xxxxxxxxxxxx> wrote:
> Hi,

Hi Mike,

> Being a pervert abusing the way subversion doesn't deal with branches
> and tags, I'm actually not a user of git-svn or git-svnimport, because
> they just can't deal easily with my perversion. So I'm writing a script
> to do the conversion for me, and since I also like to learn new things
> when I'm coding, I'm writing it in ruby.
>
> Anyways, one of the things I'm trying to convert is my svk repository
> for debian packaging of xulrunner (so, a significant subset of the
> mozilla tree), which doesn't involve a lot of revisions (around 280,
> because I only imported releases or CVS snapshots), but involves a lot
> of files (roughly 20k).
>
> The first thing I noticed when twisting around the svk repo so that
> git-svn could somehow import it a while ago, is that running git-svn
> was in my case significantly slower than svnadmin dump | svnadmin load
> (more than 2 times slower).
>
> And now, with my own script, I got the same kind of "slowdown". So I
> investigated it, and it didn't take long to realize that replacing
> git-hash-object by a simple reimplementation in ruby was *way* faster.
> git-hash-object being more than probably what you do the most when you
> import a remote repository, it is not much of a surprise that forking
> thousands of times is a huge performance waste.

I haven't looked at the timings in a while, but I suspect that exec() is
the (much bigger) culprit. I usually import from remote repositories, so
I notice network latency way before I notice local performance problems
with git-svn.

> So, just for the record, I did a lame hack of git-svn to see what kind
> of speedup could happen in git-svn. You can find this lame hack as a
> patch below. I did some tests (with a 1.5.2.1 release) and here are the
> results, importing only the trunk (192 revisions), with no checkout, and
> redirecting stdout to /dev/null:
>
> original git-svn:
> real    25m1.871s
> user    8m51.593s
> sys     12m31.659s
>
> patched git-svn:
> real    14m45.870s
> user    7m31.928s
> sys     4m1.047s

That's awesome.

> - It might be worth testing if git-cat-file is called a lot. If so,
>   implementing a simple git-cat-file equivalent that would work for
>   unpacked objects could improve speed.

IIRC git-cat-file is called a lot. Every modified file needs the
original cat-ed to make use of the delta.

> The same things obviously apply to git-cvsimport and other scripts
> calling git-hash-object a lot.

I've got a bunch of other git-svn things that I need to work on, but
having git-svn converted to use fast-import would be very nice. Or
allowing Git.pm to access more of the git internals...

However, how well/poorly would fast-import work for incremental fetches
throughout the day?

-- 
Eric Wong
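
For readers following the thread: the in-process replacement for
git-hash-object discussed above can be sketched roughly as follows. This
is not the patch from the thread (which is not reproduced here) and it
is in Python rather than ruby or Perl. The function names git_object_id
and loose_object_bytes are made up for illustration. It assumes a
SHA-1 repository and the loose-object format (a "<type> <size>\0" header
followed by the content, zlib-deflated on disk).

```python
import hashlib
import zlib


def git_object_id(data: bytes, obj_type: str = "blob") -> str:
    """Return the SHA-1 id git assigns to `data` stored as `obj_type`.

    Hypothetical helper: git hashes the header "<type> <size>\\0"
    followed by the raw content, so no child process is needed.
    """
    header = f"{obj_type} {len(data)}\0".encode("ascii")
    return hashlib.sha1(header + data).hexdigest()


def loose_object_bytes(data: bytes, obj_type: str = "blob") -> bytes:
    """Return the zlib-deflated bytes of a loose object.

    Hypothetical helper: a caller would write the result to
    .git/objects/<first 2 hex digits>/<remaining 38 hex digits>.
    """
    header = f"{obj_type} {len(data)}\0".encode("ascii")
    return zlib.compress(header + data)


if __name__ == "__main__":
    # Same id as `git hash-object --stdin` prints for empty input.
    print(git_object_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Hashing this way avoids one fork() and exec() per file, which is where
the quoted timings suggest most of the sys time was going. It only
covers writing; reading objects back (the git-cat-file case) would also
have to handle packed objects.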