Martin Langhoff wrote:

> We can make gitweb to detect mod_perl and a few smarter things if it
> is running inside of it. In fact, we can (ab)use mod_perl and perl
> facilities a bit to do some serialization which will be a big win for
> some pages. What we need for that is to set a sensible the ETag and
> use some IPC to announce/check if other apache/modperl processes are
> preparing content for the same ETag. The first-process-to-announce a
> given ETag can then write it to a common temp directory (atomically -
> write to a temp-name and move to the expected name) while other
> processes wait, polling for the file. Once the file is in place the
> latecomers can just serve the content of the file and exit.

First, it would (and could) work only when gitweb is served via
mod_perl. I'm not sure the IPC overhead and the implementation
complexity are worth it: this would perhaps be better solved by a
caching engine.

But let us put actual caching aside for a while (writing the HTML
version of a page to a common temp directory, and serving this static
page when possible), and talk a bit about what gitweb can do with
respect to cache validation.

In addition to setting either the Expires: header or Cache-Control:
max-age, gitweb should also set the Last-Modified: and ETag headers,
and probably also respond to If-Modified-Since: and If-None-Match:
requests.

Would it be worth implementing this?

> (I am calling the "state we are serving" identifier ETag because I
> think we should also set it as the ETag in the HTTP headers, so well
> be able to check the ETag of future requests for staleness - all we
> need is a ref lookup, and if the SHA1 matches, we are sorted). So
> having this 'unique request identifier' doubles up nicely...

For some pages an ETag is natural; for others Last-Modified: would be
more natural.
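To make the validation part concrete, here is a minimal sketch of
answering If-None-Match: and If-Modified-Since: requests. It is written
in Python rather than gitweb's Perl purely for illustration; the
function name check_conditional, the plain dict of request headers, and
the (status, headers) return value are all assumptions made for this
example, not anything gitweb provides.

```python
from email.utils import formatdate, parsedate_to_datetime


def check_conditional(request_headers, etag=None, last_modified_ts=None):
    """Decide between 200 and 304 for a GET request.

    request_headers  -- dict of request header name -> value (hypothetical shape)
    etag             -- opaque validator without quotes, e.g. a SHA1, or None
    last_modified_ts -- Unix timestamp (e.g. date of the top commit), or None
    Returns (status_code, response_headers).
    """
    headers = {}
    if etag is not None:
        headers["ETag"] = '"%s"' % etag
    if last_modified_ts is not None:
        headers["Last-Modified"] = formatdate(last_modified_ts, usegmt=True)

    inm = request_headers.get("If-None-Match")
    if inm is not None and etag is not None:
        # If-None-Match takes precedence over If-Modified-Since.
        tags = [t.strip() for t in inm.split(",")]
        # Weak comparison: ignore a leading W/ on the client's tags.
        tags = [t[2:] if t.startswith("W/") else t for t in tags]
        if "*" in tags or headers["ETag"] in tags:
            return 304, headers
        return 200, headers

    ims = request_headers.get("If-Modified-Since")
    if ims is not None and last_modified_ts is not None:
        try:
            since = parsedate_to_datetime(ims).timestamp()
        except (TypeError, ValueError):
            return 200, headers  # unparsable date: ignore the header
        # HTTP dates have one-second resolution, so compare whole seconds.
        if int(last_modified_ts) <= since:
            return 304, headers

    return 200, headers
```

On a 304 the page body does not have to be generated at all, which is
where the saving would come from; the cost is whatever it takes to
compute the validator for the current state of the repository.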

> The ETag should probably be:
> - SHA1+displaytype+args for pages that display an object identified
>   by SHA1

What uniquely identifies the contents of "object" views ("commit",
"tag", "tree", "blob") is either h=SHA1, or hb=SHA1;f=FILENAME (in the
absence of h=SHA1). If both h=SHA1 and hb=SHA1 are present, hb=SHA1
serves as a backlink. The "diff" views ("commitdiff", "blobdiff") are
uniquely identified by a pair of object identifiers (a pair of SHA1s,
or a pair of hb SHA1 + FILENAME). Three of those views ("blob",
"commitdiff", "blobdiff") have a "plain" version, so the ETag should
include the displaytype (action, the 'a' parameter).

The hb=SHA1;f=FILENAME identifier can be converted to h=SHA1 at the
cost of one call to a git command, namely git-ls-tree (which is a bit
expensive, as it recurses trees).

The ETag can be simply the args (query), if all h/hb/hbp parameters
are SHA1s. Or the ETag can be the SHA1 of the object (or a pair of
SHA1s in the case of a diff), but this is a little more costly to
verify, although we usually (always?) convert hb=SHA1;f=FILENAME to
h=SHA1 anyway when generating the page. Usually you can compare ETags
based on the URL alone.

> - refname+SHA!+displaytype+args for pages that display something
>   identified by a ref

For object views we can simply convert the refname to a SHA1. I'm not
sure if it is worth it. In the cases where we have to calculate the
SHA1 of the object for the view anyway, we can return (and validate)
an ETag with the SHA1 as above.

- ETag and/or Last-Modified headers for "log" views: "log", "shortlog"
  (which is part of the summary view), "history", "rss"/"atom" views.

On one hand, all log views are (at least for now) identified by their
parameters (action/view name, and filename in the case of the history
view) and the SHA1 of the top commit. On the other hand, it might be
easier to use Last-Modified with the date of the top commit...
Verifying a SHA1-based ETag could add some overhead in the case of a
miss.
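
The "ETag can be simply the args" case for object and diff views could
look roughly like the sketch below (again Python, only for
illustration). The function name etag_from_query and the decision to
hash the canonicalised query are assumptions of this example; the
parameter names h, hb and hbp are taken from the text above and may
need adjusting to whatever gitweb actually accepts.

```python
import hashlib
import re
from urllib.parse import parse_qsl

# A full 40-hex-digit object name; anything else is treated as a ref name.
SHA1_RE = re.compile(r"^[0-9a-f]{40}$")

# Parameters that name objects (assumption: the names used in the text above).
OBJECT_PARAMS = ("h", "hb", "hbp")


def etag_from_query(query):
    """Return an ETag if the URL alone pins the content, else None.

    query -- the query string, with ';' or '&' between parameters
    """
    params = dict(parse_qsl(query.replace(";", "&")))
    ids = [params[name] for name in OBJECT_PARAMS if name in params]
    if not ids or not all(SHA1_RE.match(value) for value in ids):
        # No object id, or at least one is a ref name such as "master":
        # the URL does not identify the content, so a lookup is needed.
        return None
    # Sort so that parameter order does not change the ETag; the action
    # ('a') stays in, so "blob" and "blob_plain" get different ETags.
    canonical = ";".join("%s=%s" % (key, params[key]) for key in sorted(params))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()
```

With this, a request whose If-None-Match: value equals the ETag derived
from its own URL can be answered without running any git command, while
a None result means falling back to resolving the ref first.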

> - SHA1(names and sha1s of all refs) for the summary page

Wouldn't it be simpler to just set the Last-Modified: header (and
check it)?

P.S. Can anyone post a benchmark comparing gitweb deployed under
mod_perl with gitweb deployed as a CGI script? Does kernel.org use the
mod_perl or the CGI version of gitweb?

-- 
Jakub Narebski
Poland