I like the idea behind this patch, to enable to use path_info for as much gitweb parameters as possible. After this patch series the only parameters which wouldn't be possible to represent in path_info would be: * @extra_options ('opt') multi-valued parameter, used to pass thinks like '--no-merges', which cannot be fit in the "simplified" list-like (as opposed to hash-like query string) path_info URL. * $searchtype ('st') and $searchtext ('s') etc. parameters, which are generated by HTML form, and are naturally generated in query string format. * $page ('pg') parameter, which could theoretically be added as last part of path_info URL, for example $project/next/2/... if not for pesky $project/history/next:/Documentation/2/ where you cannot be sure that having /<number>/ at the end is rare. * $order ('o') parameter, which would be hard to fit in path_info, with its limitation of parameters being specified by position. Or even next to impossible. * 'by_tag'... But I'd rather have this patch series to be in separate thread... On Sun, 19 Oct 2008, Giuseppe Bilotta wrote: > We parse requests for $project/snapshot/$head.$sfx as equivalent to > $project/snapshot/$head?sf=$sfx, where $sfx is any of the known > (although not necessarily supported) snapshot formats (or its default > suffix). > > The filename for the resulting package preserves the requested > extensions (so asking for a .tgz gives a .tgz, and asking for a .tar.gz > gives a .tar.gz), although for obvious reasons it doesn't preserve the > basename (git/snapshot/next.tgz returns a file names git-next.tgz). That is a bit of difference from sf=<format> in CGI query string, where <format> is always a name of a format (for example 'tgz' or 'tbz2'), and actual suffix is defined in %known_snapshot_formats (for example '.tar.gz' and '.tar.bz2' respectively). Now you can specify snapshot format either either by its name, for example 'tgz' (which is simple lookup in hash) which result in proposed filename with '.tgz' suffix, or you can specify suffix, for example 'tar.gz' (which requires searching through all hash) which result in proposed filename with '.tar.gz' suffix. This is a bit of inconsistency; to be consistent with how we handle 'sf' CGI parameter we would translate 'tgz' $sfx into 'tar.gz' in snapshot filename. This would also cover currently purely theoretical case when different snapshot formats (for example 'tgz' and 'tgz9') would use the same snapshot suffix (extension), but differ for example in parameters passed to compressor (for example '-9' or '--best' in the 'tgz9' case). On the other hand one would expect that when URL which looks like URL to snapshot ends with '.$sfx', then filename for snapshot would also end with '.$sfx'. This certainly requires some further thoughts. > > This introduces a potential case for ambiguity if a project has a head > that ends with a snapshot-like suffix (.zip, .tgz, .tar.gz, etc) and the > sf CGI parameter is not present; however, gitweb only produces URLs with > the sf parameter, so this is only a potential issue for hand-coded URLs > for extremely unusual project. I think you wanted to say here "_currently_ produces URLs with the 'sf' parameter" as the next patch in series changes this. > > Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@xxxxxxxxx> > --- > > I had second thoughts on this. Now we always look for the snapshot extension if > the sf CGI parameter is missing, even if the project has a head that matches > the full pseudo-refname $head.$sfx. > > The reason for this is that (1) there is no ambiguity for gitweb-generated > URLs (2) the only URLs that could fail are hand-made URLs for extremely > unusual projects and (3) it allows us to set gitweb up to generate > (unambiguous) URLs without the sf CGI parameter. This is also simpler and cheaper solution. > > This also means that I can add 3 patches to the series, instead of just one: > * patch #6 that parses the new format > * patch #7 that generates the new URLs > * patch #8 for some code refactoring Now, I haven't yet read the last patch in series, so I don't know if it is independent refactoring, making sense even before patches named #6 and #7 here, or is it connected with searching for snapshot format by suffix it uses. If the former, it should be done upfront, as it shouldn't need discussion, and being easier to be accepted into git.git. If the latter, then it should probably be folded (squashed) into #6, first patch in the series. > > gitweb/gitweb.perl | 34 ++++++++++++++++++++++++++++++++++ > 1 files changed, 34 insertions(+), 0 deletions(-) > > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl > index 99c8c20..e9e9e60 100755 > --- a/gitweb/gitweb.perl > +++ b/gitweb/gitweb.perl > @@ -609,6 +609,40 @@ sub evaluate_path_info { > $input_params{'hash_parent'} ||= $parentrefname; > } > } > + > + # for the snapshot action, we allow URLs in the form > + # $project/snapshot/$hash.ext > + # where .ext determines the snapshot and gets removed from the > + # passed $refname to provide the $hash. > + # > + # To be able to tell that $refname includes the format extension, we > + # require the following two conditions to be satisfied: > + # - the hash input parameter MUST have been set from the $refname part > + # of the URL (i.e. they must be equal) This means no "$project/.tgz?h=next", isn't it? > + # - the snapshot format MUST NOT have been defined already I would add "which means that 'sf' parameter is not set in URL", or something like that as the last line of above comment. I like that the code is so well commented, by the way. > + if ($input_params{'action'} eq 'snapshot' && defined $refname && > + $refname eq $input_params{'hash'} && Minor nit. I would use here (the question of style / better readability): + if ($input_params{'action'} eq 'snapshot' && + defined $refname && $refname eq $input_params{'hash'} && to have both conditions about $refname in the same line. > + !defined $input_params{'snapshot_format'}) { > + # We loop over the known snapshot formats, checking for > + # extensions. Allowed extensions are both the defined suffix > + # (which includes the initial dot already) and the snapshot > + # format key itself, with a prepended dot > + while (my ($fmt, %opt) = each %known_snapshot_formats) { > + my $hash = $refname; > + my $sfx; > + $hash =~ s/(\Q$opt{'suffix'}\E|\Q.$fmt\E)$//; > + next unless $sfx = $1; > + # a valid suffix was found, so set the snapshot format > + # and reset the hash parameter > + $input_params{'snapshot_format'} = $fmt; > + $input_params{'hash'} = $hash; > + # we also set the format suffix to the one requested > + # in the URL: this way a request for e.g. .tgz returns > + # a .tgz instead of a .tar.gz > + $known_snapshot_formats{$fmt}{'suffix'} = $sfx; > + last; > + } I'm not sure if it worth (see comment at the beginning of this mail) adding this code, or just allow $sfx to be snapshot _name_ (key in %known_snapshot_formats hash). Otherwise it would be as simple as checking if $known_snapshot_formats{$sfx} exists (assuming that snapshot format names does not contain '.'). If we decide to go more complicated route, then refactoring it in such a way that suffixes are also keys to %known_snapshot_formats would be preferred... err, sorry, not so simple. But refactoring this check into separate subroutine (as I think last patch in series does) would be good idea. Also, I'd rather you checked if the $refname part contains '.' for it to even consider that it can be suffix. > + } > } > evaluate_path_info(); > > -- > 1.5.6.5 > > -- Jakub Narebski Poland -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html