I am sorry for being late with review of this patch. On Sat, 26 Sep 2009, Mark Rada wrote: > Teach gitweb how to produce nicer snapshot names by only using the > short hash id. A few questions (which I think should be answered in this commit message). First, what was original behaviour of 'snapshot' action? Did gitweb always convert 'h' (hash) parameter to full SHA-1? Second, do you preserve 'snapshot' behavior that it generated archive which unpacks to the directory with the same name as basename of (proposed) archive name? I mean here that "repo-2cc6859e.tar.gz" unpacks to "repo-2cc6859e/" directory. I guess it does, but this should be mentioned in the commit message. > If clients make requests using a tree-ish that is not a > partial or full SHA-1 hash, then the short hash will also be appended > to whatever they asked for. Here example would be a good idea. I guess this means that if one is requesting for snapshot of 'next' or 'v1.6.0' of 'repo.git' project, one would get 'repo-next-2cc6859.tar.gz' or 'repo-v1.6.0-2cc6859.tar.gz' as [proposed] snapshot filename. > > This also includes tests cases for t9502-gitweb-standalone-parse-output. I am not sure if it shouldn't be rather t9502-gitweb-standalone-snapshot test; see comments below for whys. > > Signed-off-by: Mark Rada <marada@xxxxxxxxxxxx> > --- > > > Changes since v4: > - moved git_get_full_hash into this commit > - changed test case format, suggested by Junio > - explicity request at least a length of 7 for short hashes > > > gitweb/gitweb.perl | 40 +++++++++++++++-- > t/t9502-gitweb-standalone-parse-output.sh | 67 +++++++++++++++++++++++++++++ > 2 files changed, 103 insertions(+), 4 deletions(-) > create mode 100644 t/t9502-gitweb-standalone-parse-output.sh > > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl > index 8d4a2ae..bc132a5 100755 > --- a/gitweb/gitweb.perl > +++ b/gitweb/gitweb.perl > @@ -1983,14 +1983,39 @@ sub quote_command { > > # get HEAD ref of given project as hash > sub git_get_head_hash { > + return git_get_full_hash(shift, 'HEAD'); > +} Good. This means that we don't need to change code that do not use new feature (new subroutines). It's nice to cater to backward compatibility, especially if it is so low cost as this one. > + > +sub git_get_full_hash { > my $project = shift; > + my $hash = shift; I think it might be good idea here to default to 'HEAD', i.e. + my $hash = shift || 'HEAD'; > my $o_git_dir = $git_dir; > my $retval = undef; > $git_dir = "$projectroot/$project"; > - if (open my $fd, "-|", git_cmd(), "rev-parse", "--verify", "HEAD") { > - my $head = <$fd>; > + if (open my $fd, '-|', git_cmd(), 'rev-parse', '--verify', $hash) { > + $hash = <$fd>; > close $fd; > - if (defined $head && $head =~ /^([0-9a-fA-F]{40})$/) { > + if (defined $hash && $hash =~ /^([0-9a-fA-F]{40})$/) { > + $retval = $1; > + } I guess that you use "$retval = $1;" instead of just "$retval = $hash;" because of similarities with git_get_short_hash, isn't it? Or it is just following earlier code? > + } > + if (defined $o_git_dir) { > + $git_dir = $o_git_dir; > + } > + return $retval; > +} > + > +# try and get a shorter hash id > +sub git_get_short_hash { > + my $project = shift; > + my $hash = shift; > + my $o_git_dir = $git_dir; > + my $retval = undef; > + $git_dir = "$projectroot/$project"; > + if (open my $fd, '-|', git_cmd(), 'rev-parse', '--short=7', $hash) { > + $hash = <$fd>; > + close $fd; > + if (defined $hash && $hash =~ /^([0-9a-fA-F]{7,})$/) { > $retval = $1; > } > } Note that git_get_full_hash (which additionally does verification) and git_get_short_hash share much of code. Perhaps it might be worth to avoid code duplication somehow? On the other hand it might be not worth to complicate code by trying to extract common parts here... > @@ -5203,6 +5228,13 @@ sub git_snapshot { > die_error(400, 'Object is not a tree-ish'); > } > > + > + my $full_hash = git_get_full_hash($project, $hash); > + if ($full_hash =~ /^$hash/) { > + $hash = git_get_short_hash($project, $hash); > + } else { > + $hash .= '-' . git_get_short_hash($project, $hash); > + } I think we might want to avoid calling git_get_full_hash (and extra call to "git rev-parse" command, which is extra fork) if we know in advance that $full_hash =~ /^$hash/ can't be true, i.e. if $hash doesn't match /^[0-9a-fA-F]+$/. That would require that we continue to use $hash and not $full_hash, see comment for the chunk below. BTW do you think that having better name (nicer name in the case when $hash is full SHA-1, or name which describes exact version as in the case when $hash is branch name or just 'HEAD') is worth slight extra cost of "git rev-parse --abbrev=7"? > my $name = $project; > $name =~ s,([^/])/*\.git$,$1,; > $name = basename($name); > @@ -5213,7 +5245,7 @@ sub git_snapshot { > $cmd = quote_command( > git_cmd(), 'archive', > "--format=$known_snapshot_formats{$format}{'format'}", > - "--prefix=$name/", $hash); > + "--prefix=$name/", $full_hash); Why this change? > if (exists $known_snapshot_formats{$format}{'compressor'}) { > $cmd .= ' | ' . quote_command(@{$known_snapshot_formats{$format}{'compressor'}}); > } > diff --git a/t/t9502-gitweb-standalone-parse-output.sh b/t/t9502-gitweb-standalone-parse-output.sh > new file mode 100644 > index 0000000..5f2b1d5 > --- /dev/null > +++ b/t/t9502-gitweb-standalone-parse-output.sh > @@ -0,0 +1,67 @@ > +#!/bin/sh > +# > +# Copyright (c) 2009 Mark Rada > +# > + > +test_description='gitweb as standalone script (parsing script output). > + > +This test runs gitweb (git web interface) as a CGI script from the > +commandline, and checks that it produces the correct output, either > +in the HTTP header or the actual script output.' Currently all tests here are about 'snapshot' action. They are quite specific, and they do require some knowledge about chosen archive format. I think it would be better to put snapshot test into separate test, i.e. in 't/t9502-gitweb-standalone-snapshot.sh'. > + > + > +. ./gitweb-lib.sh > + > +# ---------------------------------------------------------------------- > +# snapshot file name > + > +test_commit \ > + 'SnapshotFileTests' \ > + 'i can has snapshot?' Errr... with filename [cutely] called 'i can has snapshot?' you would have, I guess, problems with tests on MS Windows, where IIRC '?' is forbidden in filenames. Perhaps +test_commit \ + 'Initial commit' \ + 'foo' or +test_commit \ + 'SnapshotFileTests' \ + 'foo' 'i can has snapshot?' > + In the test below you use "git rev-parse --verify HEAD" and "git rev-parse --short HEAD" over and over. I think it would be better to calculate them upfront: +test_expect_success 'calculate full and short ids' ' + FULLID= $(git rev-parse --verify HEAD) && + SHORTID=$(git rev-parse --short=7 HEAD) +' > +test_expect_success 'snapshots: give full hash' ' You test here that giving full has works, and that gitweb uses short hash in file name. Better name would therefore be something like +test_expect_success 'snapshots: give full hash, get short hash' ' > + ID=`git rev-parse --verify HEAD` && > + gitweb_run "p=.git;a=snapshot;h=$ID;sf=tgz" && Here I had to remember that 'tgz' snapshot format is enabled by default. I think it could be better to explicitly enable it in preparation step. > + ID=`git rev-parse --short HEAD` && > + grep ".git-$ID.tar.gz" gitweb.output Here had to think a bit that gitweb.output consists both of HTTP headers, and of response body, and you are grepping here in the HTTP headers part. It would be better solution for gitweb_run to split gitweb.output into gitweb.headers and gitweb.body (perhaps if requested by setting some variable, e.g. GITWEB_SPLIT_OUTPUT). It can be done using the following lines: sed -e '/^\r$/' <gitweb.output >gitweb.headers sed -n -e '0,/^\r$/!p' <gitweb.output >gitweb.body # gitweb.headers is used to parse http headers # gitweb.body is response without http headers But the second one uses GNU sed extension; I don't know how to write it in more portable way. Then > + grep ".git-$ID.tar.gz" gitweb.output would be + grep ".git-$ID.tar.gz" gitweb.headers Note that this would mean that t/t9501-gitweb-standalone-http-status.sh should also be updated to use gitweb.headers and gitweb.body > +' > +test_debug 'cat gitweb.output' Not a good idea in current state. gitweb.output contains binary part, and generally it is not a good idea to output binary files (which can contain ANSI escape sequences) to terminal. > + > +test_expect_success 'snapshots: give short hash' ' +test_expect_success 'snapshots: give short hash, get short hash' ' > + ID=`git rev-parse --short HEAD` && Gitweb uses '--short=7'. Shouldn't you use the same option here? > + gitweb_run "p=.git;a=snapshot;h=$ID;sf=tgz" && > + grep ".git-$ID.tar.gz" gitweb.output > +' > +test_debug 'cat gitweb.output' > + > +test_expect_success 'snapshots: give almost full hash' ' +test_expect_success 'snapshots: give almost full hash, get short hash' ' > + ID=`git rev-parse --short=30 HEAD` && > + gitweb_run "p=.git;a=snapshot;h=$ID;sf=tgz" && > + ID=`git rev-parse --short HEAD` && > + grep ".git-$ID.tar.gz" gitweb.output > +' > +test_debug 'cat gitweb.output' > + > +test_expect_success 'snapshots: give HEAD tree-ish' ' +test_expect_success 'snapshots: give HEAD, get HEAD-<short hash>' ' > + gitweb_run "p=.git;a=snapshot;h=HEAD;sf=tgz" && > + ID=`git rev-parse --short HEAD` && > + grep ".git-HEAD-$ID.tar.gz" gitweb.output > +' > +test_debug 'cat gitweb.output' > + > +test_expect_success 'snapshots: give branch name tree-ish' ' +test_expect_success 'snapshots: give branch name, get <branch name>-<short hash>' ' > + gitweb_run "p=.git;a=snapshot;h=master;sf=tgz" && > + ID=`git rev-parse --short master` && > + grep ".git-master-$ID.tar.gz" gitweb.output > +' > +test_debug 'cat gitweb.output' > + > +test_expect_success 'snapshots: give tag tree-ish' ' +test_expect_success 'snapshots: give tag name, get <tag name>-<short hash>' ' > + gitweb_run "p=.git;a=snapshot;h=SnapshotFileTests;sf=tgz" && > + ID=`git rev-parse --short SnapshotFileTests` && > + grep ".git-SnapshotFileTests-$ID.tar.gz" gitweb.output > +' > +test_debug 'cat gitweb.output' Note that to avoid ambiguities currently gitweb uses refs/heads/master and refs/tags/SnapshotFileTests... but dealing with this issue should be left, I think, for separate commit. > + > + > +test_done > -- > 1.6.4.GIT > > > -- Jakub Narebski Poland -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html