Re: [PATCH 2/3] gitweb: Cache $parent_commit info in git_blame()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jakub Narebski <jnareb@xxxxxxxxx> writes:

> On Wed, 10 Dec 2008, Nanako Shiraishi wrote:
>> Quoting Jakub Narebski <jnareb@xxxxxxxxx>:
>> 
>> > Unfortunately the implementation in 244a70e used one call for
>> > git-rev-parse to find parent revision per line in file, instead of
>> > using long lived "git cat-file --batch-check" (which might not existed
>> > then), or changing validate_refname to validate_revision and made it
>> > accept <rev>^, <rev>^^, <rev>^^^ etc. syntax.
>> 
>> Could you substantiate why this is "Unfortunate"?
>
> Because it calls git-rev-parse once for _each line_, even if for lines
> in the group of neighbour lines blamed by same commit $parent_commit
> is the same, and even if you need to calculate $parent_commit only once
> per unique individual commit present in blame output.

It probably was obvious that this was meant as a patch for better
performance without changing the functionality.  I tend to think the
presentation wasn't so great, though.

    Luben Tuikov changed 'lineno' link from leading to commit which lead
    to current version of given block of lines, to leading to parent of
    this commit in 244a70e (Blame "linenr" link jumps to previous state at
    "orig_lineno").  This supposedly made data mining possible (or just
    better).

Other than "supposedly" which I do not think should be there, I think this
is a great opening paragraph to establish the context.

    Unfortunately the implementation in 244a70e used one call for
    git-rev-parse to find parent revision per line in file, instead of
    using long lived "git cat-file --batch-check" (which might not existed
    then), or changing validate_refname to validate_revision and made it
    accept <rev>^, <rev>^^, <rev>^^^ etc. syntax.

But I do not think this is such a great second paragraph that states what
problem it tries to solve.  I'd say something like this instead:

        The current implementation calls rev-parse once per line to find
        parent revision of blamed commit, even when the same commit
        appears more than once, which is inefficient.

And then the outline of the solution:

    This patch attempts to migitate issue a bit by caching $parent_commit
    info in %metainfo, which makes gitweb to call git-rev-parse only once
    per unique commit in blame output.

which is very good, except that I do not think you need to say "a bit".
And have your benchmark (two tables and footnotes) after this outline of
the solution.

I think the first part of "Unfortunately" paragraph can be dropped
(because that is already in the first half of problem description) and the
rest can come as the last paragraph as "Possible future enhancements".

> Appendix A:
> ~~~~~~~~~~~
> #!/bin/bash
>
> export GATEWAY_INTERFACE="CGI/1.1"
> export HTTP_ACCEPT="*/*"
> export REQUEST_METHOD="GET"
> export QUERY_STRING=""$1""
> export PATH_INFO=""$2""
>
> export GITWEB_CONFIG="/home/jnareb/git/gitweb/gitweb_config.perl"
>
> perl -- /home/jnareb/git/gitweb/gitweb.perl
>
> # end of gitweb-run.sh

I'd suggest making a separate patch to add "gitweb-run.sh" in contrib/ so
that other people can use it when checking their changes to gitweb.  The
script might need some more polishing, though.  For example, it is not
very obvious if you have *_config.perl only to customize for your
environment (e.g. where the test repositories are) or you need to have
some overrides in there when you are running gitweb as a standalone script.

To recap, I think the commit log for this patch would have been much
easier to read if it were presented in this order:

	a paragraph to establish the context;

	a paragraph to state what problem it tries to solve;

        a paragraph (or more) to explain the solution; and finally

	a paragraph to discuss possible future enhancements.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux