Re: kernel.org mirroring (Re: [GIT PULL] MMC update)

"H. Peter Anvin" <hpa@xxxxxxxxx> · Sat, 09 Dec 2006 08:26:15 -0800

Martin Langhoff wrote:
> On 12/9/06, H. Peter Anvin <hpa@xxxxxxxxx> wrote:
>> Martin Langhoff wrote:
>>> I posted separately about those. And I've been mulling over whether
>>> the thundering herd is really such a big problem that we need to
>>> address it head-on.
>>
>> Uhm... yes it is.
>
> Got some more info, discussion points or links to stuff I should read
> to appreciate why that is? I am trying to articulate why I consider it
> not a high-payoff task, as well as describing how to tackle it.
>
> To recap, the reasons it is not high payoff are that:
>
> - the main benefit comes from being cacheable and able to revalidate
>   the cache cheaply (with the ETags-based strategy discussed above)
> - highly distributed caches/proxies mean we'll seldom see a true
>   cold cache situation
> - we have a huge set of URLs which are seldom hit, and will never see
>   a thundering anything
> - we have a tiny set of very popular URLs that are the key target for
>   the thundering herd (projects page, summary page, shortlog, fulllog)
> - but those are in the clear as soon as the caches are populated
>
> Why do we have to take it head-on? :-)

Because the primary failure scenario is timeout on the common queries
due to excess parallel invocations under high I/O load resulting in
catastrophic failure.
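
To make the failure mode concrete: when a popular page is not cached, every concurrent request starts its own expensive render. One common way to avoid that is to coalesce identical in-flight requests so only the first caller does the work and the rest wait for its result. The following is only a rough sketch of that idea, not how gitweb or kernel.org handles it; SingleFlight, render_summary and demo are hypothetical names made up for illustration.

```python
import threading
import time


class _Call:
    """State shared by the leader and the waiters for one key."""

    def __init__(self):
        self.done = threading.Event()
        self.result = None
        self.error = None


class SingleFlight:
    """Coalesce concurrent calls for the same key into one computation."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = {}  # key -> _Call currently being computed

    def do(self, key, fn):
        with self._lock:
            call = self._in_flight.get(key)
            leader = call is None
            if leader:
                call = _Call()
                self._in_flight[key] = call
        if leader:
            # Only the first caller for this key runs the expensive function.
            try:
                call.result = fn()
            except BaseException as exc:
                call.error = exc
            finally:
                with self._lock:
                    del self._in_flight[key]
                call.done.set()
        else:
            # Everyone else blocks until the leader has finished.
            call.done.wait()
        if call.error is not None:
            raise call.error
        return call.result


def demo(workers=8):
    """Simulate a herd of requests for one hot page (hypothetical workload)."""
    sf = SingleFlight()
    release = threading.Event()
    calls = []    # one entry per actual render
    results = []  # one entry per request served

    def render_summary():
        calls.append(1)
        release.wait()  # stand-in for a slow, I/O-bound page render
        return "summary page"

    def worker():
        results.append(sf.do("summary", render_summary))

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    time.sleep(0.2)  # give every request time to arrive while the render is blocked
    release.set()
    for t in threads:
        t.join()
    return len(calls), results
```

Calling demo(8) issues eight overlapping requests for the same key and returns how many renders actually ran alongside the responses served. The sketch covers a single process only; coalescing across separate CGI processes would need some shared lock (a lock file, for instance), which is outside what this example shows.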

	-hpa