Re: [PATCH] gitweb: Support caching projects list

On Thu, Mar 13, 2008 at 08:07:09PM -0400, Jay Soffian wrote:
> On Thu, Mar 13, 2008 at 7:14 PM, Petr Baudis <pasky@xxxxxxx> wrote:
> >  diff --git a/gitweb/gitweb.css b/gitweb/gitweb.css
> >  index 8e2bf3d..673077a 100644
> >  --- a/gitweb/gitweb.css
> >  +++ b/gitweb/gitweb.css
> >  @@ -85,6 +85,12 @@ div.title, a.title {
> >         color: #000000;
> >   }
> >
> >  +div.stale_info {
> >  +       display: block;
> >  +       text-align: right;
> >  +       font-style: italic;
> >  +}
> >  +
> >   div.readme {
> >         padding: 8px;
> >   }
> 
> What does this have to do with it?

The box tells the user that the projects list they are looking at
comes from the cache, i.e. may be stale.
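
Roughly speaking, gitweb prints a small notice styled by that rule
whenever the cached list is served; something like this (the variable
names below are illustrative, not necessarily what the patch uses):

    # Illustrative sketch: emit a notice that the displayed projects
    # list comes from the cache; div.stale_info right-aligns and
    # italicizes it.
    if ($projects_from_cache) {
            my $age = time() - (stat($cache_file))[9]; # cache mtime
            print "<div class=\"stale_info\">\n";
            print "Cached projects list, ${age} seconds old\n";
            print "</div>\n";
    }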

> >  diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> >  index bcb6193..0eee195 100755
> >  --- a/gitweb/gitweb.perl
> >  +++ b/gitweb/gitweb.perl
> >  @@ -122,6 +122,15 @@ our $fallback_encoding = 'latin1';
> 
> ...
> 
> >  +               if ($cache_lifetime and -f $cache_file) {
> >  +                       # Postpone timeout by two minutes so that we get
> >  +                       # enough time to do our job.
> >  +                       my $time = time() - $cache_lifetime + 120;
> >  +                       utime $time, $time, $cache_file;
> >  +               }
> 
> Race condition. I don't see any locking. Nothing keeps multiple instances from
> regenerating the cache concurrently...
> 
> >  +               @projects = git_get_projects_details($projlist, $check_forks);
> >  +               if ($cache_lifetime and open (my $fd, '>'.$cache_file)) {
> 
> ...and then clobbering each other here. You have two choices:
> 
> 1) Use a lock file for the critical section.
> 
> 2) Assume the race condition is rare enough, but you still need to account for
> it. In that case, you want to write to a temporary file and then rename to the
> cache file name. The rename is atomic, so though N instances of gitweb may
> regenerate the cache (at some CPU/IO overhead), at least the cache file won't
> get corrupted.

You are of course right - I wanted to do the rename, but forgot to write
it in the actual code. :-)
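
For the record, the shape it should have is: write to a temporary
file and rename() it into place (a sketch, not the posted patch; the
actual serialization of @projects is elided):

    # Write to a PID-suffixed temporary file in the same directory,
    # then rename() it over the cache file.  rename() is atomic
    # within one filesystem, so racing regenerations may waste some
    # work but can never leave a half-written cache behind.
    @projects = git_get_projects_details($projlist, $check_forks);
    if ($cache_lifetime) {
            my $tmp_file = "$cache_file.$$";
            if (open my $fd, '>', $tmp_file) {
                    # ... dump @projects into $fd here ...
                    close $fd;
                    rename $tmp_file, $cache_file
                            or unlink $tmp_file;
            }
    }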

There is a more conceptual problem, though: for big sites like this,
it really makes more sense to regenerate the cache explicitly at
regular intervals instead of making random clients wait for the
regeneration. We could add a 'force_update' parameter, accepted from
localhost only, that would always regenerate the cache, but that
feels rather kludgy - can anyone think of a more elegant solution?
(I don't think splitting the @projects-generating code out of gitweb,
and then having to worry about keeping it in sync during gitweb
upgrades, is any better.)
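
Just to illustrate what I mean by the kludge (hypothetical sketch,
not something I'm proposing as-is):

    # Hypothetical 'force_update' parameter: honour it only when the
    # request comes from localhost, and let it bypass the lifetime
    # check so that a cron job can refresh the cache pre-emptively.
    my $force_update = $cgi->param('force_update') &&
                       ($ENV{REMOTE_ADDR} || '') eq '127.0.0.1';
    my $cache_age = -f $cache_file ? time() - (stat(_))[9] : undef;
    if ($force_update or !defined $cache_age
        or $cache_age > $cache_lifetime) {
            # ... regenerate the cache as above ...
    }

A cron job on the server would then just fetch gitweb with
force_update=1 every few minutes, so normal visitors would always hit
a warm cache.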

> Out of curiosity, repo.or.cz isn't running this as a CGI is it? If so, wouldn't
> running it as a FastCGI or modperl be a vast improvement?

Unlikely. Currently the machine is mostly I/O-bound, and only a small
portion of the CPU usage comes from gitweb itself.

-- 
				Petr "Pasky" Baudis
Whatever you can do, or dream you can, begin it.
Boldness has genius, power, and magic in it.	-- J. W. von Goethe
