Adam Roben <aroben@xxxxxxxxx> wrote: > We were spending a lot of time forking/execing git-cat-file and > git-hash-object. We now use command_bidi_pipe to keep one instance of each > running and feed it input on stdin. Nice job! I just got access to a very fast SVN repository for a project I'm working on (not working on git-svn itself, unfortunately). A few comments and small nitpicks below: > Signed-off-by: Adam Roben <aroben@xxxxxxxxx> > --- > git-svn.perl | 94 ++++++++++++++++++++++++++++++++++++++++++++------------- > 1 files changed, 72 insertions(+), 22 deletions(-) > +package Git::Commands; Can this be a separate file, or a part of Git.pm? I'm sure other scripts can eventually use this and I've been meaning to split git-svn.perl into separate files so it's easier to follow. > +use vars qw/$_cat_blob_pid $_cat_blob_in $_cat_blob_out $_cat_blob_ctx $_cat_blob_separator > + $_hash_object_pid $_hash_object_in $_hash_object_out $_hash_object_ctx/; I have trouble following long lines, and most of the git code also wraps at 80-columns. Dead-tree publishers got this concept right a long time ago :) > +use strict; > +use warnings; > +use File::Temp qw/tempfile/; > +use Git qw/command_bidi_pipe command_close_bidi_pipe/; > + > +sub _open_cat_blob_if_needed { > + return if defined($_cat_blob_pid); > + $_cat_blob_separator = "--------------GITCATFILESEPARATOR-----------"; Brian brought this up already, but yes, having pre-defined separators instead of explicitly-specified sizes makes it all too easy for a malicious user to commit code that will break things for git-svn users. > +sub hash_object { > + my (undef, $fh) = @_; > + > + my ($tmp_fh, $tmp_filename) = tempfile(UNLINK => 1); > + while (my $line = <$fh>) { > + print $tmp_fh $line; > + } > + close($tmp_fh); Related to the above. It's better to sysread()/syswrite() or read()/print() in a loop with a predefined buffer size rather than to use a readline() since you could be dealing with files with very long lines or binaries with no newline characters in them at all. > + _open_hash_object_if_needed(); > + print $_hash_object_out $tmp_filename . "\n"; Minor, but print $_hash_object_out $tmp_filename, "\n"; avoids creating a new string. -- Eric Wong - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html