[I Cc-ed everybody who *might* be interested in this series. I am sorry if it included somebody by mistake] This 22+ patches long series (2 last patches are proof of concept) is intended as replacement (rewrite) of "Gitweb caching v7" series from John 'Warthog9' Hawley (J.H.): http://thread.gmane.org/gmane.comp.version-control.git/160147 This is sixth version (6th release) of this series, and is available in the following repositories (links are to web interface): http://repo.or.cz/w/git/jnareb-git.git http://github.com/jnareb/git as 'gitweb/cache-kernel-v6' branch. Earlier versions are available at http://repo.or.cz/w/git/jnareb-git.git as 'gitweb/cache-kernel-v5' (previous version) to 'gitweb/cache-kernel' (first version). Previous version of this series was sent to git mailing list as: [PATCHv5 00/17] gitweb: Simple file based output caching Message-Id: <1286402526-13143-1-git-send-email-jnareb@xxxxxxxxx> http://thread.gmane.org/gmane.comp.version-control.git/158313 You can find link to next to previous version, et cetera. The main ideas in lifted from J.H. patches are the following (features in common with "Gitweb caching v7" series by John Hawley): * caching captured output of gitweb in flat files, without any serialization (caching raw data) * using global (per-cache, not per-entry) expiration time, and using difference between mtime of cache file and current time for expiration * using file locking (flock) to prevent 'cache miss stampede' problem, i.e. to ensure that only one process is (re)generating cache entry * serving stale but not too old version, and regenerating data in background, to avoid waiting for data to be regenerated * progress info indicator based on http-equiv refresh trick (described in more detail how it works in the commit message) * capturing gitweb output by redirecting STDOUT to cache entry file The main differences between this patch series and "Gitweb caching v7" (and my minimal fixups in "Gitweb caching v7.[1-3]") are the following: * features are added piece by piece in multiple patches (22 patches covering v7 features vs 3-4 patches in v7/v7.x series), making it hopefully easier to review, as patches are smaller. OTOH this series is much longer... * In J.H. series subroutines responsible for capturing gitweb output are in gitweb.perl, and subroutines responsible for caching are in lib/cache.pl (cache.pm in original patch). cache.pl/cache.pm uses variables and subroutines from gitweb script, so it couldn't be made into Perl module; therefore we have to use 'do' rather than 'require' to load it. In this series GitwebCache::Capture::Simple module is responsible for capturing [gitweb] output, GitwebCache::SimpleFileCache and GitwebCache::FileCacheWithLocking are responsible for caching, and GitwebCache::CacheOutput is about caching captured output (ties them together). This allowed "unit" testing, i.e. testing each module in isolation (tests t9503 - t9505). * GitwebCache::CacheOutput::cache_output (equivalent of cache_fetch from cache.pm in J.H. patch) supports any cache supporting ->get / ->set or ->compute interface (e.g. Cache::FileCache from Cache::Cache, or CHI with 'File' driver, or Cache::FastMmap) - it is described in gitweb/README in "Gitweb caching" section. For this capturing engine (GitwebCache::Capture::Simple) supports returning captured output (via capturing to in-memory file). Tested once upon a time with Cache::FileCache $cache. * There is no difference between treating actions with binary output or possibly binary output like 'snapshot' or 'blob_plain' (which use binary or ':raw' mode) and other actions (which use text or ':utf8' mode). GitwebCache::Capture::Simple captures transformed output i.e. raw bytes, so data from cache is dumped to STDOUT (to web browser) in ':raw' mode. * Instead of disabling caching of 'blame_incremental' action (so it is used without caching), this alternate to plain 'blame' action is disabled if caching is turned off. In the future 'blame_interactive' would use cache for caching its initial output and for caching 'blame_data' it uses. * Configuring cache is done via %cache_options (and %generating_options) instead of via gitweb config variables. For example instead of $minCacheTime there is $cache_options{'expires_min'}. It is also more configurable than in J.H. patch; more parameters can be changed (like e.g. factor multiplying get_loadavg() in adaptive cache lifetime; 'check_load', 'generating_info', 'on_error' are configurable callbacks). "gitweb: Support legacy options used by kernel.org caching engine" patch in this series makes this rewrite support configuration variables used by "Gitweb caching v7" series. * This rewrite uses lexical filehandles, i.e. open my $fh, '>', $filename instead of globals that J.H. patch uses open FH, '>', $filename (though it hides it in "open(cacheFile, '<', $filename)"). J.H. is working on "Gitweb caching v8" and I think he would address that issue there. * When generating cache in background process, the background process daemonizes itself. Therefore it should be safe to enable / use 'background_cache' also for persistent environments, like mod_perl via ModPerl::Registry, FastCGI when run as gitweb.fcgi, PSGI via gitweb.psgi wrapper that git-instaweb generates. - Other changes might be mentioned in comments to individual patches Two last patches in this series introduce proof of concept cache administration page, where you can currently check how much file space is used by cache, and where you can also safely clean cache (remove all entries). Those two patches are slightly outside scope of "gitweb output caching", and that is why I refer to this series as 22+ patches long (there are 24 patches in total). Previous version of this series had gitweb/lib - Benchmarking GitwebCache::SimpleFileCache (in t/9603/) gitweb/lib - Alternate ways of capturing output as two last patche in the series. They are missing in this release. The following changes since commit 0b0cd0e0a29a139f418991dd769ea4266ffec370: Merge branch 'jn/ignore-doc' (2010-12-03 16:13:06 -0800) are available in the git repository at: git://repo.or.cz/git/jnareb-git.git gitweb/cache-kernel-v6 Jakub Narebski (24): t/test-lib.sh: Export also GIT_BUILD_DIR in test_external gitweb: Prepare for splitting gitweb gitweb/lib - Very simple file based cache gitweb/lib - Stat-based cache expiration gitweb/lib - Regenerate entry if the cache file has size of 0 gitweb/lib - Simple output capture by redirecting STDOUT gitweb/lib - Cache captured output (using get/set) gitweb: Add optional output caching gitweb/lib - Adaptive cache expiration time gitweb/lib - Use CHI compatibile (compute method) caching interface gitweb/lib - capture output directly to cache entry file gitweb/lib - Use locking to avoid 'cache miss stampede' problem gitweb/lib - No need for File::Temp when locking gitweb/lib - Serve stale data when waiting for filling cache gitweb/lib - Regenerate (refresh) cache in background gitweb: Introduce %actions_info, gathering information about actions gitweb: Show appropriate "Generating..." page when regenerating cache gitweb/lib - Configure running 'generating_info' when generating data gitweb: Add startup delay to activity indicator for cache gitweb/lib - Add support for setting error handler in cache gitweb: Wrap die_error to use as error handler for caching engine gitweb: Support legacy options used by kernel.org caching engine gitweb/lib - Add clear() and size() methods to caching interface gitweb: Add beginnings of cache administration page (proof of concept) gitweb/Makefile | 23 +- gitweb/README | 62 +++ gitweb/gitweb.perl | 544 +++++++++++++++++++- gitweb/lib/GitwebCache/CacheOutput.pm | 131 +++++ gitweb/lib/GitwebCache/Capture/Simple.pm | 110 ++++ gitweb/lib/GitwebCache/FileCacheWithLocking.pm | 376 ++++++++++++++ gitweb/lib/GitwebCache/SimpleFileCache.pm | 592 ++++++++++++++++++++++ t/gitweb-lib.sh | 12 + t/t9500-gitweb-standalone-no-errors.sh | 20 + t/t9501-gitweb-standalone-http-status.sh | 21 + t/t9502-gitweb-standalone-parse-output.sh | 33 ++ t/t9503-gitweb-caching-interface.sh | 34 ++ t/t9503/test_cache_interface.pl | 647 ++++++++++++++++++++++++ t/t9504-gitweb-capture-interface.sh | 34 ++ t/t9504/test_capture_interface.pl | 108 ++++ t/t9505-gitweb-cache.sh | 39 ++ t/t9505/test_cache_output.pl | 86 ++++ t/test-lib.sh | 4 +- 18 files changed, 2850 insertions(+), 26 deletions(-) create mode 100644 gitweb/lib/GitwebCache/CacheOutput.pm create mode 100644 gitweb/lib/GitwebCache/Capture/Simple.pm create mode 100644 gitweb/lib/GitwebCache/FileCacheWithLocking.pm create mode 100644 gitweb/lib/GitwebCache/SimpleFileCache.pm create mode 100755 t/t9503-gitweb-caching-interface.sh create mode 100755 t/t9503/test_cache_interface.pl create mode 100755 t/t9504-gitweb-capture-interface.sh create mode 100755 t/t9504/test_capture_interface.pl create mode 100755 t/t9505-gitweb-cache.sh create mode 100755 t/t9505/test_cache_output.pl -- 1.7.3 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html