Jeff King <peff@xxxxxxxx> wrote:
> On Mon, Aug 27, 2007 at 11:54:30PM -0400, Shawn O. Pearce wrote:
>
> > This would be much faster if it was in Perl/Python/Tcl as the script
> > could avoid two forks per file and instead just fork git-config
> > once/twice and git-fast-import once. I think those two per-file
> > forks is what is killing the performance.
>
> It's a bit faster, but you still get killed on passing all of the data
> through userspace and a pipe, rather than just having git-add hash it
> directly.

Right. Now that fast-import has a 'progress' command in its stream
language it may be possible to let it read from raw files in the
UNIX filesystem. I almost implemented this a while ago but realized
it wasn't very useful because the frontend couldn't tell when
fast-import was done reading from the file. With 'progress' coming
back on stdout that is now possible.

> Some timings importing git.git's contents:
>
>   git-import-core
>   real    0m0.839s
>   user    0m0.504s
>   sys     0m0.304s
>
>   git-import-shell
>   real    0m4.947s
>   user    0m2.604s
>   sys     0m2.912s
>
>   git-import-perl
>   real    0m1.400s
>   user    0m1.144s
>   sys     0m0.180s

That doesn't surprise me. It would be very hard to beat `git-add .`.

Where fast-import usually wins is avoiding many fork+exec if you
are doing many individual commits. For just a single commit it is
almost not worth starting the fast-import process. Start doing 5
or so commits, especially across different branches, and suddenly
fast-import is incredibly fast.

Another place where fast-import can be a big win over git-add is
importing a huge number of loose objects and then doing a `repack
-a -d -f`. Even though the fast-import packfile is suboptimal we
can usually get to the data in the packfile faster than we can get
to it from loose object files. So adding in the repack time might
show fast-import is slightly faster in some conditions.

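To make the multi-commit case concrete, here is a minimal Python sketch
(not taken from any of the scripts timed above) that builds one
fast-import stream for several commits, so a frontend forks
fast-import once rather than once per file or per commit. The names
build_stream and run_import, the committer identity, the timestamp and
the ref are all hypothetical placeholders, and paths are assumed to
need no quoting. A 'progress' line follows each commit so the frontend
can tell how far fast-import has read.

```python
import subprocess


def build_stream(commits, ref="refs/heads/master",
                 who="Example Importer <importer@example.com>",
                 when="1188273270 +0000"):
    """Return a fast-import stream (bytes) for a list of commits.

    commits is a list of (message, {path: content_bytes}) tuples.
    All defaults are placeholder values chosen for illustration.
    """
    out = []
    for n, (message, files) in enumerate(commits, start=1):
        msg = message.encode("utf-8")
        out.append(b"commit %s\n" % ref.encode())
        out.append(b"mark :%d\n" % n)
        out.append(b"committer %s %s\n" % (who.encode(), when.encode()))
        out.append(b"data %d\n%s\n" % (len(msg), msg))
        if n > 1:
            # Chain each commit onto the previous one via its mark.
            out.append(b"from :%d\n" % (n - 1))
        for path, content in sorted(files.items()):
            # 'inline' sends the blob right after the M line.
            out.append(b"M 100644 inline %s\n" % path.encode())
            out.append(b"data %d\n%s\n" % (len(content), content))
        # fast-import echoes this line once it has read this far.
        out.append(b"progress commit %d done\n" % n)
    return b"".join(out)


def run_import(stream, repo_dir):
    """Feed the stream to one git fast-import process in repo_dir."""
    subprocess.run(["git", "fast-import"], input=stream,
                   cwd=repo_dir, check=True)
```

Calling build_stream() with five small commits and passing the result
to run_import() costs one fork+exec in total, which is the situation
where fast-import should start to pay off. For a single commit the
startup cost likely dominates, as the timings above suggest.
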
I certainly like the idea of having this simple script available
in a few different languages, for comparison, and to help others
get started using fast-import.

-- 
Shawn.