On Thu, Mar 13, 2008 at 08:07:09PM -0400, Jay Soffian wrote:
> On Thu, Mar 13, 2008 at 7:14 PM, Petr Baudis <pasky@xxxxxxx> wrote:
> > diff --git a/gitweb/gitweb.css b/gitweb/gitweb.css
> > index 8e2bf3d..673077a 100644
> > --- a/gitweb/gitweb.css
> > +++ b/gitweb/gitweb.css
> > @@ -85,6 +85,12 @@ div.title, a.title {
> >  	color: #000000;
> >  }
> >
> > +div.stale_info {
> > +	display: block;
> > +	text-align: right;
> > +	font-style: italic;
> > +}
> > +
> >  div.readme {
> >  	padding: 8px;
> >  }
>
> What does this have to do with it?

The box shows that cached information is being shown.

> > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> > index bcb6193..0eee195 100755
> > --- a/gitweb/gitweb.perl
> > +++ b/gitweb/gitweb.perl
> > @@ -122,6 +122,15 @@ our $fallback_encoding = 'latin1';
> > ...
> > +	if ($cache_lifetime and -f $cache_file) {
> > +		# Postpone timeout by two minutes so that we get
> > +		# enough time to do our job.
> > +		my $time = time() - $cache_lifetime + 120;
> > +		utime $time, $time, $cache_file;
> > +	}
>
> Race condition. I don't see any locking. Nothing keeps multiple
> instances from regenerating the cache concurrently...
>
> > +	@projects = git_get_projects_details($projlist, $check_forks);
> > +	if ($cache_lifetime and open (my $fd, '>'.$cache_file)) {
>
> ...and then clobbering each other here. You have two choices:
>
> 1) Use a lock file for the critical section.
>
> 2) Assume the race condition is rare enough, but you still need to
> account for it. In that case, you want to write to a temporary file
> and then rename it to the cache file name. The rename is atomic, so
> though N instances of gitweb may regenerate the cache (at some
> CPU/IO overhead), at least the cache file won't get corrupted.

You are of course right - I wanted to do the rename, but forgot to
write it in the actual code.
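For clarity, here is a minimal sketch of the write-to-temp-then-rename pattern Jay describes, written in Python rather than gitweb's Perl; the function name and layout are my own illustration, not the actual patch:

```python
import os
import tempfile

def write_cache_atomically(cache_file, data):
    """Write data to cache_file without ever exposing a partial file.

    Racing writers may each regenerate the cache (wasting some CPU/IO),
    but because the final rename is atomic, readers always see either
    the old complete file or the new complete file, never a torn mix.
    """
    dirname = os.path.dirname(cache_file) or "."
    # Create the temp file in the same directory as the target so the
    # rename stays within one filesystem (rename is only atomic there).
    fd, tmp_path = tempfile.mkstemp(dir=dirname, prefix=".cache.")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes hit the disk
        os.replace(tmp_path, cache_file)  # atomic rename over the target
    except BaseException:
        os.unlink(tmp_path)  # don't leave stale temp files behind
        raise
```

The lock-file alternative (option 1) would instead serialize regeneration, e.g. with `fcntl.flock` on a sentinel file, trading the duplicated work for blocked requests.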
:-)

There is a more conceptual problem, though: for such big sites, it
really makes more sense to regenerate the cache explicitly and
periodically instead of making random clients wait it out. We could
add a 'force_update' parameter, accepted from localhost only, that
would always regenerate the cache, but that feels rather kludgy - can
anyone think of a more elegant solution? (I don't think taking the
@projects generating code out of gitweb and then having to worry
about it during gitweb upgrades is any better.)

> Out of curiosity, repo.or.cz isn't running this as a CGI is it? If
> so, wouldn't running it as a FastCGI or modperl be a vast
> improvement?

Unlikely. Currently the machine is mostly IO-bound and only a small
portion of the CPU usage comes from gitweb itself.

-- 
				Petr "Pasky" Baudis
Whatever you can do, or dream you can, begin it.
Boldness has genius, power, and magic in it.	-- J. W. von Goethe