Re: [PATCH 0/9] Gitweb caching v5

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 01/14/2010 05:40 PM, Jakub Narebski wrote:
> "John 'Warthog9' Hawley" <warthog9@xxxxxxxxxxxxxx> writes:
> 
>> Afternoon everyone,
>>  
>> This is the latest incarnation of gitweb w/ caching.  This is
>> finally at the point where it should probably start either being
>> considered for inclusion or mainline, or I need to accept that this
>> will never get in and more perminantely fork (as is the case with
>> Fedora where this is going in as gitweb-caching as a parrallel rpm
>> package).
>>
>> That said this brings the base up to mainline (again),
> 
> Could you tell us which commit is the base of this series (like in
> git-request-pull output), i.e. which commit this series is rebased
> against?

This series was based on
git://git.kernel.org/pub/scm/git/git.git
054d2fa05cf0bc55fe1556c9e87d58d67a144f44

http://git.kernel.org/?p=git/warthog9/gitweb.git;a=shortlog;h=refs/heads/gitweb-ml-v5

> 
>> it updates a
>> number of elements in the caching engine, and this is a much cleaner
>> break-out of the tree vs. what I am currently developing against.
> 
> Is caching engine part changed since v2?

Slightly, not dramatically.  Only changes were to modify the caching
engine to deal with the change in output mechanism (I.E. my $output ->
print {$fh})

<snip>
>> John 'Warthog9' Hawley (9):
>>   gitweb: Load checking
> Looks good.
> 
>>   gitweb: change die_error to take "extra" argument for extended die
>>     information
> Commit message could be better (summary should really be shorter), and
> I think there is some indent typo, but otherwise looks good.
> 
>>   gitweb: Add option to force version match
> For me it needs to be disabled in gitweb test suite (t/gitweb-lib.sh),
> if it is enabled by default.  I really like that I can test current
> gitweb without need to recompile git.
> 
> Also it should have tests that it works as intended (both for matching
> and non-matching versions) in t/t9501-gitweb-standalone-http-status.sh

I'll get t9501 cleaned up and make sure that the tests default to
turning it off, and I'll add a test to confirm that this works.

>>   gitweb: Makefile improvements
> Does it differ from my proposal (i.e. gitweb/Makefile doing the work),
> based on your idea ("make gitweb" for Makefile and gitweb/Makefile)?

I think this is taken straight from the version you had, I don't think
I've modified it.

> 
>>   gitweb: add a get function to compliment print_local_time
>>   gitweb: add a get function to compliment print_sort_th
> Those two looks O.K. from what I seen.
> 
>>   gitweb: cleanup error message produced by undefined $site_header
> Shouldn't there be such protection for other such variables, like
> $site_footer and $home_text (and a bit diferent protection against
> undefined $projects_list)?  By the way, how did you arrived at
> undefined $site_header: deafult build configuration leaves it empty,
> but defined.

I would have to go back and figure it out, but it's something I hit
years ago and added that check to keep it from spewing all over my logs.
 Could easily add it to the others mentioned.

>>   gitweb: Convert output to using indirect file handle
> I have alternate solution, using shorter filehandle name (just $out)
> in
> 
>   git://repo.or.cz/git/jnareb-git.git gitweb/cache-kernel
>   http://repo.or.cz/w/git/jnareb-git.git/shortlog/refs/heads/gitweb/cache-kernel
> 
> I would have to think a bit about separate handle for binary files;
> I am not sure if it is really required.

For caching you have to have it.  When your outputing the data back from
the cache you need to switch the output mode for the browser to receive
the data properly.  Otherwise the resulting output from the caching
engine is going to be garbage.  The caching engine explicitly stores the
binary files separate from the rest of the response.

>>   gitweb: File based caching layer (from git.kernel.org)
> I am working (time permitting) in spliting this large code drop into
> smaller commits, namely:

first up, is there a reason not to take the caching layer as it stands
while you work on these?  I'm fine with adding test cases for what's
there now if you want, but I guess I'm confused about explicitly wanting
to break these into smaller commits.

>  - href(..., -path_info => 0)          (for cache key)

You actually *really* want to have the full url vs. just the path_info.
 While I accept that this means that you will end up with multiple
copies of data being stored it helps dramatically if you have multiple
sites pointing into the same caching space.  If you happen to have two
distinct trees

http://git.public.com/?p=test.git;a=summary
http://git.private.com/?p=test.git;a=summary

That respectively point to:

/group/public/git/test.git
/group/private/git/test.git

you'll end up squashing the cache files needlessly and erroneously as
what's in the cache file will depend on what last site was hit that
generated the file.

>  - simple file based caching + tests
>  - global expire time + tests
>  - output caching in gitweb            (WIP)
>  - adaptive expiration time            (planned)
>  - tee output / cache write            (planned)

You sadly can't 'tee' the output as this would re-introduce the
stampeding heard problem which is one of the reasons the caching layer
came about in the first place.  Suppose you could give one person the
output but make everyone else wait for the cache to finish writing out,
or have the waiting client processes tail the file while it's generated
but those both seem a little excessive vs. just waiting.

>  - expire time variation from CHI      (planned)
>  - locking for single writer           (planned)
>  - server-side generating info         (planned)
>  - AJAX-y generating info              (wishlist)

If it's helpful I can genuinely devote several more days to this to get
these cleaned up.  Which would save you a fair amount of time in
breaking this up.

- John 'Warthog9' Hawley
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]