Re: [PATCH] gitweb: return correct HTTP status codes

Jakub Narebski <jnareb@xxxxxxxxx> · Wed, 18 Jun 2008 09:35:53 +0200

Lea Wiemann wrote:
> Jakub Narebski wrote:
>> Lea Wiemann wrote:
>>>
>>> $hash = get_hash($symbol, 'commit'); # 'commit' to resolve tags
>> 
>> Errr... is there an equivalent to ^{}, i.e. resolve to non-tag?
> 
> Yup.  Haven't quite decided whether to simply use "$symbol^{type}" or 
> make type a separate parameter.

Although I don't think gitweb itself will need this, the ability
should (I guess) be in a generic object-oriented interface such as
the Git::Repo module.
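
To make the two options concrete, here is a rough sketch (in Python
rather than gitweb's Perl, purely for illustration) of the "separate
type parameter" variant built on top of the "$symbol^{type}" syntax.
The names peel_expr and the repo_path argument are my own assumptions,
not anything from Git::Repo; only the <rev>^{<type>} / <rev>^{}
revision syntax and "git rev-parse --verify" are real git features.

```python
import subprocess


def peel_expr(symbol, obj_type=None):
    """Build a revision expression that peels `symbol` to `obj_type`.

    With obj_type=None this yields "<symbol>^{}", which git dereferences
    until it reaches a non-tag object.
    """
    return "%s^{%s}" % (symbol, obj_type or "")


def get_hash(repo_path, symbol, obj_type=None):
    """Resolve `symbol` to a full object name, or None if it does not resolve.

    Hypothetical helper: shells out to `git rev-parse --verify --quiet`.
    """
    result = subprocess.run(
        ["git", "-C", repo_path, "rev-parse", "--verify", "--quiet",
         peel_expr(symbol, obj_type)],
        capture_output=True, text=True)
    return result.stdout.strip() if result.returncode == 0 else None
```

With this shape, get_hash(path, 'v1.0', 'commit') and
get_hash(path, 'v1.0^{commit}') ask git for the same thing, so the
separate parameter is only a convenience for callers.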

>> Note that you would have to examine gitweb sources to check if it
>> uses href(..., -replay=>1) when it should,
> 
> Good point, will do.

It should, but I might have missed something.

>> BTW. one of the earliest ideas was to fully resolve hashes, add missing
>> parameters if possible (like 'h', 'hp', 'f') and convert hashes to
>> sha-1.  One of intended uses was (weak) ETag for simple HTTP caching.
> 
> Interesting.  Something to keep in mind is that using name-rev still can 
> wreck with this since it has the unique property of taking hashes but 
> still depending on the current refs.  Gitweb isn't using name-rev a lot 
> right now, but that might change of course (e.g. I think that it would 
> be convenient to always display names along with any commit hashes).

Well, you will have a two-level cache: the first level maps refnames[*1*]
to hashes (by the way, I think you can treat tag names as valid
indefinitely, and use them in place of full sha-1 hashes; because of the
heads<->tags ambiguity gitweb now spells tags in full as refs/tags/<tag>),
the second maps "normalized" URLs to content, or rather a "normalized"
URL derivative to the data used to generate content.

So I think you can use the first-level cache to calculate the equivalent
of 'name-rev', or keep cached 'name-rev' output there.
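
Roughly what I have in mind, as a minimal sketch only (Python rather
than Perl, just for illustration; the class name, the ref_ttl default,
and the resolver/generate callbacks are all assumptions of mine, not
existing gitweb code):

```python
import time


class TwoTierCache:
    """First tier: (project, refname) -> sha1.  Second tier: sha1-based key -> data."""

    def __init__(self, ref_ttl=60.0, clock=time.monotonic):
        self.ref_ttl = ref_ttl  # seconds a branch name stays trusted (assumed value)
        self.clock = clock
        self.refs = {}          # (project, refname) -> (sha1, expires_at)
        self.content = {}       # (project, action, sha1) -> data

    def resolve(self, project, refname, resolver):
        """Return the sha1 for refname, calling resolver only on a miss or expiry."""
        key = (project, refname)
        now = self.clock()
        hit = self.refs.get(key)
        if hit is not None and hit[1] > now:
            return hit[0]
        sha1 = resolver(project, refname)
        # Tag names are treated as valid indefinitely; other refs expire.
        if refname.startswith("refs/tags/"):
            expires_at = float("inf")
        else:
            expires_at = now + self.ref_ttl
        self.refs[key] = (sha1, expires_at)
        return sha1

    def fetch(self, project, action, sha1, generate):
        """Return data for a fully resolved request; it never needs to expire."""
        key = (project, action, sha1)
        if key not in self.content:
            self.content[key] = generate(project, action, sha1)
        return self.content[key]
```

Only the first tier ever has to be invalidated; everything keyed by
a full sha-1 in the second tier stays valid for as long as you care
to keep it.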

Currently gitweb uses name-rev only in the (I think) rarely used 'raw'
(text/plain) version of the 'commit' and 'commitdiff' views... and I
think it is here to stay, thanks to the gitk-like display of ref markers
in log-like views (I think the result of git_get_references() should
also be cached...).

[*1*] Well, project name (repository path) + refname.

We will see if this two-tier cache solution is a good idea...

>> All the time I think that caching _everything_ is a bad solution.
> 
> So?  We can easily add an option to the cache; e.g. no_cache => 
> ['get_blob', 'ls_tree'].  I doubt that it will be needed, but if it 
> does, it's easy to add it.  Don't worry about it, really.

Well, on the other hand, from what I understand the kernel.org gitweb
caches every page unconditionally (but adaptively), and so does CGit
(used for example on freedesktop.org).
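
For the record, the opt-out you describe (no_cache => ['get_blob',
'ls_tree']) could look roughly like the sketch below. It is in Python
rather than Perl, just for illustration, and the class and method
names are my own assumptions; only the no_cache option and the two
command names come from your mail.

```python
class SelectiveCache:
    """Cache results of named commands, except those listed in no_cache."""

    def __init__(self, no_cache=()):
        self.no_cache = frozenset(no_cache)
        self.store = {}  # (name, args) -> result

    def call(self, name, func, *args):
        """Run func(*args), reusing a stored result unless name is excluded.

        Arguments must be hashable, since they form part of the cache key.
        """
        if name in self.no_cache:
            return func(*args)
        key = (name, args)
        if key not in self.store:
            self.store[key] = func(*args)
        return self.store[key]
```

So excluding large blobs or trees later is a one-line configuration
change, e.g. SelectiveCache(no_cache=['get_blob', 'ls_tree']), and
the default stays "cache everything".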

It would be nice to have some gitweb access statistics (e.g. from
repo.or.cz and kernel.org).

>> CHI (or other in recommended thread) for unobtrusive data caching
> 
> Thanks for the pointer!  On the one hand CHI is very recent and not even 
> in Debian, on the other hand it provides things like busy_lock on top of 
> Memcached (AFAICS), at fairly little cost.  I'll look into it.

I was not thinking about using CHI, or any Perl module mentioned in

  "[RFD] Gitweb caching, part 3: examining Perl modules for caching (long)"
  http://thread.gmane.org/gmane.comp.version-control.git/77529/focus=77527

(perhaps with the exception of Cache::Cache and its submodules/engines),
but rather about emulating the best of its interfaces (well, perhaps
also "borrowing" some of its code, if the license permits).

-- 
Jakub Narebski
Poland