H. Peter Anvin <hpa@xxxxxxxxx> wrote:
> Jakub Narebski wrote:
>>
>> I don't think it can be easily expanded. .git/info/refs is meant for
>> http-fetch, and it mimics git-ls-remote / git-peek-remote output.
>
> For heaven's sake, in computer science we can *NEVER* use the same
> feature for *MORE THAN ONE THING*.  If it doesn't work format-wise
> that's fine, but "it's only supposed to be used by dumb transports" is
> ridiculous.

.git/info/refs is for dumb transports, so if we follow the "do not use
the same feature for more than one thing" principle we should not change
its format for gitweb.

.git/info/refs is one of the auxiliary info files that help dumb servers
(servers that do not do on-the-fly pack generation) let clients discover
which references the server has. The second auxiliary info file is
.git/objects/info/packs. Both are generated by the
git-update-server-info command, usually run from the post-update hook.

Because the .git/info/refs format is the same as git-ls-remote output
(AFAIK smart servers use git-ls-remote or git-peek-remote; dumb servers
use .git/info/refs), we have used it, and can keep using it, as
''cached'' "git ls-remote ." / "git peek-remote ." /
"git show-ref --dereference" output.

For bare repositories, where new data arrives only via 'update' (via
push or fetch), which always triggers the post-update hook, and not,
for example, via git-commit, which does not invoke the post-update
hook, the information in .git/info/refs is always fresh.

What I propose as a quick solution is to add a new (perhaps local)
git-update-gitweb-info command, to be run from the post-update hook
(and perhaps from post-commit for non-bare repos), whose results we
would use in gitweb. See the patch at the bottom.

>> BTW. putting the info of git-for-each-ref into .git/info/refs-details
>> would mean that instead of "24175 calls to git" one would need to
>> read 24175 files.
>> Perhaps the whole info needed to generate projects
>> index page should be pre-generated on push (update), instead of per
>> project (per repository) .git/info/refs-details
>
> No, it should be one file per repository, not one file per ref.  Why?
> Obviously we don't want 24175 files to be accessed.  However, a push can
> only affect files for which the repository owner has permission and
> which resides in the repository filespace, so it should stay inside that
> space.

Gitweb _never_ did one call to git _per ref_, but always one call to git
_per repository_! Old gitweb always used the HEAD ref to get "Last
Change" info and used one call to git-rev-list (if I remember
correctly); new gitweb checks all refs to get "Last Change" info but
uses _one_ call to git-for-each-ref.

Because we did not want to hurt gitweb performance, we held off on
changing "Last Change" to check all refs (and not only HEAD) until
git-for-each-ref was available, so that it could be done with one call
to a git command. Historically it was the first use of git-for-each-ref
in gitweb.

Sidenote: I planned to add a new %feature to gitweb to let one choose
whether to use all refs for "Last Change" info, the HEAD ref, or some
given ref (for example "master"). But that would perhaps wait for a
.git/config parser in Perl.

> On kernel.org, this would reduce the load from 24175 calls to git to
> reading 250 files.  Although the latter is still expensive (and will
> probably need post-generation caching) the files should be small and
> cacheable by the kernel, and the resulting I/O load should be quite small.

Oh, so there are around 250 projects, and around 24175 references
together in those projects on kernel.org? I thought it was 24175
_projects_ (repositories)...

Currently, it is 250 calls to git, reading 24175 files (unless refs are
packed, then it would be reading 250 files) to get refs (heads) info,
and reading around 2*250 files (packs + index) to get last change info.
Not "24175 calls to git".
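
To make "one call to git per repository" concrete, here is a minimal
shell sketch of the same lookup. It is an illustration only, not what
gitweb literally runs: the function names last_change_epoch and
list_last_changes, and the assumption that bare repositories live as
<root>/*.git, are mine.

```shell
#!/bin/sh
# Sketch only: function names and the <root>/*.git layout are assumptions.

# Extract the epoch timestamp from a "%(committer)" line such as
#   A U Thor <author@example.com> 1165000000 +0100
# Like gitweb's regexp, it expects "<epoch> <timezone>" at the end of
# the line and prints nothing when the line does not match.
last_change_epoch () {
	printf '%s\n' "$1" |
	sed -n 's/^.* \([0-9][0-9]*\) [-+][01][0-9][0-9][0-9]$/\1/p'
}

# Print "<epoch> <repository>" for every bare repository under $1,
# using exactly one git invocation per repository, however many refs
# that repository holds.
list_last_changes () {
	for repo in "$1"/*.git
	do
		test -d "$repo" || continue
		line=$(git --git-dir="$repo" for-each-ref \
			--format='%(committer)' --sort=-committerdate \
			--count=1 refs/heads) || continue
		printf '%s %s\n' "$(last_change_epoch "$line")" "$repo"
	done
}
```

Calling list_last_changes with a directory holding 250 repositories
would therefore start git 250 times, regardless of the number of refs.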
> Anyway, as far as git-update-server-info is concerned, I'm *very*
> concerned that there be a single command that updates all the cached
> information across the repository.  Telling everyone to update their
> hooks every time we want to add cached information is silly.  Right now,
> git-update-server-info is the command to update cached information, and
> for usability reasons there should be a single entry point.

git-update-server-info is meant to "update auxiliary info file to help
dumb servers". I propose to use a (new) git-update-gitweb-info to help
gitweb. One command for one feature.

Unfortunately, this would mean adding a git-update-gitweb-info line
(if it does not exist) to the post-update hooks of existing projects,
placed so that an earlier "exec" does not keep it from running. For new
projects I think it would be enough to modify the post-update template
(templates/hooks--post-update or
/usr/share/git-core/templates/hooks/post-update).

Below are patches showing how it can be done. They do not include the
Makefile changes needed to install git-update-gitweb-info. NOT TESTED!

BTW the final version of git-update-gitweb-info should probably be a
built-in command, like git-update-server-info, not a script.

diff --git a/git-update-gitweb-info.sh b/git-update-gitweb-info.sh
new file mode 100755
index 0000000..5bb44df
--- /dev/null
+++ b/git-update-gitweb-info.sh
@@ -0,0 +1,7 @@
+#!/bin/sh
+
+. git-sh-setup
+test -w "$GIT_DIR/info/last-changed" &&
+git-for-each-ref \
+	--format='%(committer)' --sort=-committerdate --count=1 refs/heads \
+	> "$GIT_DIR/info/last-changed"
diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
index 88af2e6..e7874a6 100755
--- a/gitweb/gitweb.perl
+++ b/gitweb/gitweb.perl
@@ -1150,12 +1150,16 @@ sub git_get_last_activity {
 	my ($path) = @_;
 	my $fd;
 
-	$git_dir = "$projectroot/$path";
-	open($fd, "-|", git_cmd(), 'for-each-ref',
-	     '--format=%(committer)',
-	     '--sort=-committerdate',
-	     '--count=1',
-	     'refs/heads') or return;
+	if (-r "$projectroot/$path/info/last-changed") {
+		open($fd, '<', "$projectroot/$path/info/last-changed") or return;
+	} else {
+		$git_dir = "$projectroot/$path";
+		open($fd, "-|", git_cmd(), 'for-each-ref',
+		     '--format=%(committer)',
+		     '--sort=-committerdate',
+		     '--count=1',
+		     'refs/heads') or return;
+	}
 	my $most_recent = <$fd>;
 	close $fd or return;
 	if ($most_recent =~ / (\d+) [-+][01]\d\d\d$/) {
diff --git a/templates/hooks--post-update b/templates/hooks--post-update
old mode 100644
new mode 100755
index bcba893..b119224
--- a/templates/hooks--post-update
+++ b/templates/hooks--post-update
@@ -6,3 +6,4 @@
 # To enable this hook, make this file executable by "chmod +x post-update".
 
-exec git-update-server-info
+git-update-server-info
+exec git-update-gitweb-info

-- 
Jakub Narebski
Poland