Bruno Cesar Ribas <ribas@xxxxxxxxxxxx> writes:

> I made another SIMPLE bench on gitweb. Testing time on git-for-each-ref.
>
> Using my 1000 projects I ran:
> 8<----------------
> #/bin/bash
> PEGAR_ref() {
>   PROJ=projeto$1.git;
>   cd $PROJ;
>   printf "\tlastref = $(git-for-each-ref --sort=-committerdate --count=1\
>   --format='%(committer)')\n" >> config;
>   cd -;
> }
> cd $HOME/scm
> for((i=1;i<=1000;i++)){ PEGAR_ref $i & }
> 8<----------------

Could you please not mix English and your native language (Portuguese?)
in the examples you show? Mixing two languages in one identifier name
(unless it is "ref" in br too) is especially bad form... TIA.

Besides, what I'm more interested in is the script used to generate
those 1000 projects...

> And at the "git_get_last_activity" instead of running git-for-each-ref i
> asked to get gitweb.lastref
>
> Here are the results:
> "dd" means: dd if=/dev/zero of=$HOME/dd/$i bs=1M count=400000
>
> Running 2 dd to generate disk IO. Here comes the results:
> NO projects_list   projects_list
>    7m56s55            6m11s95      cached last change, using gitweb.lastref
>   16m30s69           15m10s74      default gitweb, using FS's owner
>   16m07s40           15m24s34      patched to get gitweb.owner
>
> Now results for a 1000projects on an idle machine. (No dd running to
> generate IO)
> NO projects_list   projects_list
>    0m26s79            0m38s70      cached last change, using gitweb.lastref
>    1m19s08            1m09s55      default gitweb, using FS's owner
>    1m17s58            1m09s55      patched to get gitweb.owner

Are those the results of running gitweb as a standalone script, or of
your script running git-for-each-ref?

Besides, I'd rather see results of running ApacheBench. On Linux it
usually comes with the Apache installation, and it is invoked as 'ab'.
Instead of adding artificial load, your tests could use concurrent
requests, and more than one request to get a better average.

> I found out those VERY interesting, so instead of trying to think a
> new way to store gitweb config, we should think a way to cache those
> information.
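Coming back to the ApacheBench suggestion: something along these lines
is what I have in mind. This is only a sketch; the URL, the default
request counts and the helper name bench_cmd are assumptions of mine,
not anything from your setup. It only prints the 'ab' command line, so
it can be checked before being run.

```shell
#!/bin/sh
# Hypothetical helper: builds an ApacheBench command line.
# The URL and the default counts below are assumptions for illustration.
bench_cmd() {
    url=$1
    requests=${2:-100}     # total requests, so the mean covers more than one sample
    concurrency=${3:-10}   # simultaneous requests, instead of dd-generated load
    echo ab -n "$requests" -c "$concurrency" "$url"
}

# Print the command for the projects list page; pipe the output to sh to run it.
bench_cmd "http://localhost/gitweb.cgi"
```

Running it once with and once without the cached value would give
numbers that are directly comparable.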
Below are my thoughts about caching information for gitweb.

First, the basis of any optimization is finding the bottlenecks. I
think it was posted here some time ago that the pages generating the
most load are the projects list and the feeds. Kernel.org even runs a
modified version of gitweb with some caching support; cgit (a git web
interface in C) also has caching support.

Because gitweb outputs relative time on the projects list page and on
the project summary page, it is unfortunately not easy to simply cache
the HTML output: one would either have to give up relative time, or
rewrite time from relative to absolute, either on the server (in
gitweb) or on the client (in JavaScript).

So perhaps it would be better to cache the generated (costly to obtain)
information, such as the last-changed time for projects.

Or we can, for example, assume (i.e. do this only if the appropriate
gitweb feature is set) that projects are bare repositories that are
pushed to, and that git-update-server-info is run on repository update
(for example for the HTTP transport), and stat $GIT_DIR/info/refs
and/or $GIT_DIR/objects/info/packs instead of running git-for-each-ref.
Of course the column would then be called something like "Last Update"
instead of "Last Change".

The "Last Update" information is especially easy because it can be
invalidated / updated externally, by the update / post-receive hook,
outside gitweb. So gitweb doesn't need to implement a cache
invalidation mechanism for this.

We can store the lastref / lastchange information in the repository
config, for example under the "gitweb.lastref" key. We can store it in
a gitweb-wide config, for example in a $projectroot/gitwebconfig file,
under a "gitweb.<project>.lastref" key. Or we can store it as a hash
initializer in some sourced Perl file, read from gitweb_config.perl
(this I think can be done even now without touching gitweb code at
all); we can use Data::Dumper to save such information. The
possibilities are many.
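To make the hook idea a bit more concrete, here is a minimal sketch of
what a post-receive hook could do. It is untested against a real
gitweb: the "gitweb.lastref" key is the one from this thread (stock
gitweb would not read it unless patched), and the function names
last_change and update_lastref are made up for illustration.

```shell
#!/bin/sh
# Sketch of a post-receive hook body that caches the last change.
# Assumption: gitweb is patched to read "gitweb.lastref" from the repo config.

# Print the committer line of the most recently committed-to ref.
last_change() {
    git --git-dir="$1" for-each-ref --sort=-committerdate --count=1 \
        --format='%(committer)'
}

# Store that value in the repository config, replacing any older one.
update_lastref() {
    dir=$1
    stamp=$(last_change "$dir") || return 0
    # Nothing to record in an empty repository.
    [ -n "$stamp" ] || return 0
    git --git-dir="$dir" config gitweb.lastref "$stamp"
}

# Hooks are run with GIT_DIR set; do nothing when called outside a hook.
if [ -n "${GIT_DIR:-}" ]; then
    update_lastref "$GIT_DIR"
fi
```

Since the hook rewrites the value on every push, the cached entry is
never stale and gitweb itself needs no invalidation logic.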
-- 
Jakub Narebski
Poland
ShadeHawk on #git