Re: [PATCH 1/6] gitweb: action in path with use_pathinfo

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I'm sorry for the delay reviewing those patches.

On Sun, 21 Sep 2008, Giuseppe Bilotta wrote:

> When using path info, reduce the number of CGI parameters by embedding
> the action in the path. The typicial cgiweb path is thus
> $project/$action/$hash_base:$file_name or $project/$action/$hash

cgiweb?

> 
> This is mostly backwards compatible with the old-style gitweb paths,
> except when $project/$branch was used to access a branch whose name
> matches a gitweb action.

I would also state that gitweb _generates_ such pathinfo links if
configured (or if coming from pahinfo URL), and that this change
allow to represent wider number of gitweb links (gitweb URLs) in
pathinfo form.

Also, from what I understand, generated pathinfo links now always
use action, so they are a tiny little bit longer.

> Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@xxxxxxxxx>
> ---
>  gitweb/gitweb.perl |  109 ++++++++++++++++++++++++++++++++++------------------
>  1 files changed, 72 insertions(+), 37 deletions(-)
> 
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index da474d0..e783d12 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -488,6 +488,37 @@ if (defined $searchtext) {
>  	$search_regexp = $search_use_regexp ? $searchtext : quotemeta $searchtext;
>  }
>  
> +# dispatch
> +my %actions = (
> +	"blame" => \&git_blame,
> +	"blobdiff" => \&git_blobdiff,
> +	"blobdiff_plain" => \&git_blobdiff_plain,
> +	"blob" => \&git_blob,
> +	"blob_plain" => \&git_blob_plain,
> +	"commitdiff" => \&git_commitdiff,
> +	"commitdiff_plain" => \&git_commitdiff_plain,
> +	"commit" => \&git_commit,
> +	"forks" => \&git_forks,
> +	"heads" => \&git_heads,
> +	"history" => \&git_history,
> +	"log" => \&git_log,
> +	"rss" => \&git_rss,
> +	"atom" => \&git_atom,
> +	"search" => \&git_search,
> +	"search_help" => \&git_search_help,
> +	"shortlog" => \&git_shortlog,
> +	"summary" => \&git_summary,
> +	"tag" => \&git_tag,
> +	"tags" => \&git_tags,
> +	"tree" => \&git_tree,
> +	"snapshot" => \&git_snapshot,
> +	"object" => \&git_object,
> +	# those below don't need $project
> +	"opml" => \&git_opml,
> +	"project_list" => \&git_project_list,
> +	"project_index" => \&git_project_index,
> +);
> +

This is as I understand simply moving %actions hash around?
Well, you could instead split hash declaration from defining it,
in the form of

   my %actions = ();
   ...
   %actions = (
        ...
   );

but I guess moving declaration earlier is good solution.

>  # now read PATH_INFO and use it as alternative to parameters
>  sub evaluate_path_info {
>  	return if defined $project;
> @@ -512,6 +543,16 @@ sub evaluate_path_info {
>  	# do not change any parameters if an action is given using the query string
>  	return if $action;
>  	$path_info =~ s,^\Q$project\E/*,,;
> +
> +	# next comes the action
> +	$action = $path_info;
> +	$action =~ s,/.*$,,;

I would use perhaps "$action = ($path_info =~ m!^([^/]+)/!;"
But that is Perl, so TIMTOWDI.

> +	if (exists $actions{$action}) {
> +		$path_info =~ s,^\Q$action\E/*,,;
> +	} else {
> +		$action  = undef;
> +	}
> +

I don't think you need quoting pattern metacharacters; valid actions
should not contain regexp metacharacters.  Defensive programming?

>  	my ($refname, $pathname) = split(/:/, $path_info, 2);
>  	if (defined $pathname) {
>  		# we got "project.git/branch:filename" or "project.git/branch:dir/"
> @@ -525,10 +566,12 @@ sub evaluate_path_info {
>  		}
>  		$hash_base ||= validate_refname($refname);
>  		$file_name ||= validate_pathname($pathname);
> +		$hash      ||= git_get_hash_by_path($hash_base, $file_name);

I don't understand why you feel that you need to do this (this is
additional git command fork, as git_get_hash_by_path calls Git, to
be more exact it calls git-ls-tree (it could call git-rev-parse
instead).  Moreover, I don't understand why you need to do this _here_,
instead of just before where you would have to have $hash variable set.

>  	} elsif (defined $refname) {
>  		# we got "project.git/branch"
> -		$action ||= "shortlog";
> -		$hash   ||= validate_refname($refname);
> +		$action    ||= "shortlog";
> +		$hash      ||= validate_refname($refname);
> +		$hash_base ||= validate_refname($refname);
>  	}
>  }
>  evaluate_path_info();
> @@ -624,8 +636,13 @@ sub href (%) {
>  	if ($params{-replay}) {
>  		while (my ($name, $symbol) = each %mapping) {
>  			if (!exists $params{$name}) {
> -				# to allow for multivalued params we use arrayref form
> -				$params{$name} = [ $cgi->param($symbol) ];
> +				if ($cgi->param($symbol)) {
> +					# to allow for multivalued params we use arrayref form
> +					$params{$name} = [ $cgi->param($symbol) ];
> +				} else {
> +					no strict 'refs';
> +					$params{$name} = $$name if $$name;
> +				}
>  			}
>  		}
>  	}

What this change is about? And why this change is _here_, in this
commit? It is I think unrelated, and wrong change. 

href(..., -replay=>1) is all about reusing current URL, perhaps with
a few parameters changed, like for example pagination links differ only
in page number param change.  For example if we had only hb= and f=
parameters, -replay=>1 links should use only those, and not add h=
parameter because somewhere we felt that we need $hash to be calculated.

> @@ -636,10 +653,28 @@ sub href (%) {

This hunk is about generating pathinfo URL, isn't it?

You probably would want to change comment at the top of this part
of href() subroutine, namely

  	if ($use_pathinfo) {
		# use PATH_INFO for project name

as you now try to use PATH_INFO for more than project name. Please do
not leave comments to get out of sync with the code.

>  		$href .= "/".esc_url($params{'project'}) if defined $params{'project'};
>  		delete $params{'project'};
>  
> -		# Summary just uses the project path URL
> -		if (defined $params{'action'} && $params{'action'} eq 'summary') {
> +		# Summary just uses the project path URL, any other action come next
> +		# in the URL
> +		if (defined $params{'action'}) {
> +			$href .= "/".esc_url($params{'action'}) unless $params{'action'} eq 'summary';
>  			delete $params{'action'};
>  		}
> +
> +		# next, we put either hash_base:file_name or hash
> +		if (defined $params{'hash_base'}) {
> +			$href .= "/".esc_url($params{'hash_base'});
> +			if (defined $params{'file_name'}) {
> +				$href .= ":".esc_url($params{'file_name'});
> +				delete $params{'hash'} if $params{'hash'} eq git_get_hash_by_path($params{'hash_base'},$params{'file_name'});

First, this page has around 140 characters. That is too long, much too long.
Please try to wrap code around 80-characters.

Second, following Petr 'Pasky' Baudis suggestion of reducing complexity
and shortening gitweb URLs, we could unconditionally remove redundant
'hash' parameter if we have both 'hash_base' and 'file_name'
parameters.  This would also simplify and speed up (lack of extra fork)
gitweb code.

> +				delete $params{'file_name'};
> +			} else {
> +				delete $params{'hash'} if $params{'hash'} eq $params{'hash_base'};
> +			}
> +			delete $params{'hash_base'};
> +		} elsif (defined $params{'hash'}) {
> +			$href .= "/".esc_url($params{'hash'});
> +			delete $params{'hash'};
> +		}

O.K.... I think.

Did you test this, preferably also using Mechanize test, with gitweb
configuration turning path_info on by default.?

>  	}
>  
>  	# now encode the parameters explicitly

Hmmm... now I am not so sure if it wouldn't be better to split this
patch in pathinfo parsing and pathinfo generation. The first part
would be obvious, the second part would be smaller and easier to review.

-- 
Jakub Narebski
Poland
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux