Re: [PATCH] gitweb: Option to omit column with time of the last change

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 16 Apr 2012, Kacper Kornet wrote:
> On Mon, Apr 16, 2012 at 10:06:49PM +0200, Jakub Narebski wrote:
>> On Mon, 16 Apr 2012, Kacper Kornet wrote:
>>> On Sat, Apr 14, 2012 at 03:16:01PM +0200, Jakub Narebski wrote:

>>>>   That saves I/O, but not fork.
>> 
>> Actually if you look at the footer of projects list page with 'timed'
>> feature enable you see that for N projects on list, gitweb uses 2*N+1
>> git commands.  The "+1" part is from "git version", the "2*N" are from
>> git-for-each-ref to get last activity (and verify repository as a
>> side-effect)...
> 
> It is how I started to think about the problem. With my additional patch
> to remove the owner I am able to reduce the number of git invocations to
> 1.

That is a very good thing.

Especially that adding caching to gitweb is long in coming (to core 
gitweb at least), and that not always one is able to turn on caching.

[...]
>>> My tests show that forks are also a bottleneck in my setup.
>> 
>> What do you mean by "my tests" here?  Is it benchmark (perhaps just using
>> 'timed' feature) with and without custom change removing fork(s)?  Or did
>> you used profiler (e.g. wondefull Devel::NYTProf) for that?
> 
> Nothing fancy. I look at the footnote produced by "timed" feature. And
> I see a difference between version with the following patch:
> 
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 18cdf96..4a13807 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -3156,6 +3156,18 @@ sub git_get_project_owner {
>  	return $owner;
>  }
>  
> +sub git_repo_exist {

Perhaps a better name would be validate_headref() or check_head(),
called from subroutine named either is_git_directory() as in setup.c,
or verify_repo()...

> +	my ($path) = @_;

BTW this could be written as

  +	my $path = shift;

though it is largely a matter of taste.

> +       my $fd;
> +
> +       $git_dir = "$projectroot/$path";
> +       open($fd, "<", "$git_dir/HEAD") or return;

It can be written as

  +	open my $fd, "<", "$projectroot/$path/HEAD"
  +		or return;

> +       my $line = <$fd>;
> +       close $fd or return;

Shouldn't we chomp($line)?

> +       return 1 if (defined $fd && substr($line, 0, 10) eq 'ref:refs/' 
> +           || $line=~m/^[0-9a-z]{40}$/);
> +       return 0;

I don't think we need to check that $fd is defined; if it isn't, we would
return earlier I think.

There can be space between "ref:" and fully qualified branch name, and in
fact git puts such space:

  $ cat .git/HEAD 
  ref: refs/heads/gitweb/web

Also, you can return boolean value.

So the above would reduce to:

  +	return ($line =~ m!^ref:\s*refs/! ||
  +	        $line =~ m!^[0-9a-z]{40}$!);

> +}
> +
>  sub git_get_last_activity {
>  	my ($path) = @_;
>  	my $fd;
> @@ -5330,6 +5342,7 @@ sub fill_project_list_info {
>  	my $show_ctags = gitweb_check_feature('ctags');
>   PROJECT:
>  	foreach my $pr (@$projlist) {
> +             next PROJECT unless git_repo_exist($pr->{'path'});

I understand that it is proof of concept patch, but I think that
in real patch iy would be better to update check_export_ok() instead
of this addition.

>  		if (project_info_needs_filling($pr, $filter_set->('age', 'age_string'))) {
>  			my (@activity) = git_get_last_activity($pr->{'path'});
>  			unless (@activity) {
> 
> 
> and the one in which  git_repo_exist() uses invocation to /bin/true:
> 
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index 18cdf96..4bcc66f 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -3156,6 +3156,13 @@ sub git_get_project_owner {
>  	return $owner;
>  }
>  
> +sub git_repo_exist {
> +	my ($path) = @_;
> +
> +        $git_dir = "$projectroot/$path";
> +        return not system('/bin/true');
> +}
> +

What were the differences in timing?

>>>                                                             On the other 
>>> hand I think that I can trust that my projects.list contains only valid
>>> repositories. So I would prefer to have a don't verify option. Or there
>>> is a possibility to write perl function with the same functionality as
>>> is_git_directory() from setup.c and use it to verify if the directory is a
>>> valid git repo.
>> 
>> Well, we can add those checks to check_export_ok()... well to function
>> it would call.
>> 
>> is_git_repository from setup.c checks that "/objects" and "/refs"
>> have executable permissions, and that "/HEAD" is valid via validate_headref
>> which does slightly more than (now slightly misnamed) check_head_link()
>> from gitweb...
>> 
>> ...or that DB_ENVIRONMENT i.e. GIT_OBJECT_DIRECTORY environment variable
>> is set, and path that it points to has executable permissions.  I don't
>> think we have to worry about this for gitweb.
> 
>> I'll try to come up with a patch to gitweb that improves repository
>> verification, and perhaps a patch that uses the fact that "git config"
>> succeeded to verify repository.
> 
> As you see it is more or less what I have already written for my tests.
> I only don't check if /objects and /refs are directories. If you want I
> can send proper patch submission for this function

I don't know how strict we should be, but "/objects" and "/refs" are just
one stat more.


Anyway, if you plan on resending this patch series, then "gitweb: Improve
repository verification" should be be first, I think.

-- 
Jakub Narebski
Poland
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]