Re: is there a fast web-interface to git for huge repos?

On 06/07/2013 01:02 PM, Constantine A. Murenin wrote:
>> That's a one-time penalty. Why would that be a problem? And why is wget
>> even mentioned? Did we misunderstand each other?
> 
> `wget` or `curl --head` would be used to trigger the caching.
> 
> I don't understand how it's a one-time penalty.  No one wants to look
> at an old copy of the repository, so, pretty much, if, say, I want to
> have a gitweb of all 4 BSDs, updated daily, then, even with lots of
> RAM (e.g. to eliminate the cold-cache 5s penalty and reduce each page
> to 0.5s), on a quad-core box I'd be kinda lucky to complete a
> generation of all the pages within 12h or so, obviously using the
> machine at, or above, 50% capacity just for the caching.  Or several
> days, or even a couple of weeks, on an Intel Atom or VIA Nano with
> 2GB of RAM or so.  Obviously not acceptable; there has to be a
> better solution.
> 
> One could, I guess, only regenerate the pages which have changed, but
> it still sounds like an ugly solution, where you'd have to generate a
> list of files that have changed between one generation and the next,
> and you'd still have very high CPU, cache, and storage requirements.

Have you already ruled out caching on a proxy?  Pages would only be generated
on demand, so the first visitor would still experience the delay, but the rest
would be fast until the page expires.  Even an expiry as short as five minutes
would probably provide significant processing savings (depending on how many
users you have), and that level of staleness, plus the occasional delay, may
be acceptable to your users.

As you say, generating the entire cache upfront and continuously is wasteful
and probably unrealistic, but any type of caching, by definition, is going to
involve users seeing stale content, and I don't see that you have any other
option but some type of caching.  Well, you could reproduce what git does in a
bunch of distributed algorithms and run your app on a farm--which, I guess, is
probably what GitHub is doing--but throwing up a caching reverse proxy is a
lot quicker if you can accept the caveats.

-- 
Charles McGarvey


