John, could you please in the future Cc me? I am interested in gitweb output caching development. Thanks in advance. "John 'Warthog9' Hawley" <warthog9@xxxxxxxxxxxxxx> writes: > Afternoon everyone, > > (Afternoon is like morning, right?) > > This is the latest incarnation of gitweb w/ caching. Per the general > consensus and requests from the recent GitTogether I'm re-submitting > my patches. > > Bunch of re-works in the code, and several requested features. Sadly the > patch series has balloned as I've been adding things. It was 3-4 patches, > it's now 18. This is based on top of Jakub's v7.2 patch series, but > it should be more or less clean now. Could you please rebase it on top of v7.2 version? The v7.2 patch series contained a few bugs that needs to be corrected. > > As such there was a bunch of changes that I needed to do to Jakub's tree > which are indicated in the series. Why did I do them up as separate things? > Mainly there's a bunch of history that's getting lost right now between > going back and forth, and I wanted to have clear patches to discuss > should further discussion be needed. I guess that in the final submission (i.e. the one that is to be merged in into git.git repository) those changes would be squashed in, isn't it? > > This still differs, by two patches, from whats in production on kernel.org. > It's missing the index page git:// link, and kernel.org and kernel.org also > has the forced version matching. As a note I'll probably let this stew > another day or so on kernel.org and then I'll push it into the Fedora update > stream, as there's a couple of things in this patch series that would be > good for them to have. There was some discussion about git:// link in the past; nevertheless this issue is independent on gitweb caching and can (and should) be sent as a aeparate patch. IIRC we agreed that because of backward compatibility forced versions match is quite useless (in general)... > > There is one additional script I've written that the Fedora folks are using, > and that might be useful to include, which is an 'offline' cache file generator. > It basically wraps gitweb.cgi and at the end moves the cache file into the right > place. The Fedora folks were finding it took hours to generate their front > page, and that doing a background generation almost never completed (due to > process death). This was a simple way to handle that. If people would like > I can add it in as an additional patch. Are you detaching the background process? It would be nice to have it as separate patch. > > v8: > - Reverting several changes from Jakub's change set that make no sense > - is_cacheable changed to always return true - nothing special about > blame or blame_incremental as far as the caching engine is concerned 'blame_incremental' is just another version of 'blame' view. I have disabled it when caching is enabled in my rewrite (you instead disabled caching for 'blame_incremental' in your v7 and mine v7.x) because I couldn't get it to work together with caching. Did you check that it works? Besides, withou "tee"-ing, i.e. printing output as it is captured, cached 'blame_data' means that 'blame_incremental' is not incremental, and therefore it vanishes its advantage over 'blame'. In the case data is in cache, then 'blame_inremental' doesn't have advantage over 'blame' either. > - Reverted config file change "caching_enabled" back to "cache_enable" as this > config file option is already in the wild in production code, as are all > current gitweb-caching configuration variables. [Explitive deleted.] I dislike strongly this $cache_enable. I think it would be better for backward compatibility (should we keep backward compatibility with out-of-tree patches?) to use the same mechanism as provided in [PATCHv6/RFC 22/24] gitweb: Support legacy options used by kernel.org caching engine http://thread.gmane.org/gmane.comp.version-control.git/163052/focus=163058 http://repo.or.cz/w/git/jnareb-git.git/commitdiff/27ec67ad90ecd56ac3d05f6a9ea49b6faabf7d0a in my rewrite. Just set $caching_enabled to true if $cache_enable is defined and true. > - Reverted change to reset_output as > open STDOUT, ">&", \*STDOUT_REAL; > causes assertion failures: > Assertion !((((s->var)->sv_flags & (0x00004000|0x00008000)) == 0x00008000) && (((svtype)((s->var)->sv_flags & 0xff)) == SVt_PVGV || ((svtype)((s->var)->sv_flags & 0xff)) == SVt_PVLV)) failed: file "scalar.xs", line 49 at gitweb.cgi line 1221. > if we encounter an error *BEFORE* we've ever changed the output. Which Perl version are you using? Because I think you found error in Perl. Well, at least I have not happen on this bug. I have nothing againts using open STDOUT, ">&STDOUT_REAL"; though I really prefer that you used lexical filehandles, instead of "globs" which are global variables. The following works: open STDOUT, '>&', fileno($fh); Note that fileno(Symbol::qualify_to_ref($fh)) might be needed... > - Cleanups there were indirectly mentioned by Jakub > - Elimination of anything even remotely looking like duplicate code > - Creation of isBinaryAction() and isFeedAction() Could you please do not use mixedCase names? First, that is what %actions_info from [PATCH 16/24] gitweb: Introduce %actions_info, gathering information about actions http://thread.gmane.org/gmane.comp.version-control.git/163052/focus=163038 http://repo.or.cz/w/git/jnareb-git.git/commitdiff/305a10339b33d56b4a50708d71e8f42453c8cb1f I have invented for. Second, why 'isBinaryAction()'? there isn't something inherently different between binary (':raw') and text (':utf8') output, as I have repeatedly said before. See my rewrite: there is no special case for binary output (or perhaps binary output as in the case of 'blob_plain' action). > - Adding in blacklist of "dumb" clients for purposes of downloading content > - Added more explicit disablement of "Generating..." page Good, I'll check this. > - Added better error handling > - Creation of .err file in the cache directory > - Trap STDERR output into $output_err as this was spewing data prior > to any header information being sent Why it is needed? We capture output of "die" via CGI::Util::set_message, and "warn" output is captured to web server logs... unless you explicitely use "print STDERR <sth>" -- don't do that instead. > - Added hidden field in footer for url & hash of url, which is extremely useful > for debugging Nice idea, I'll see it. Can it be disabled (information leakage)? -- Jakub Narebski Poland ShadeHawk on #git -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html