On Tue, 23 Jun 2009 21:18:36 -0200, "Ronan Lucio" <listas@xxxxxxxxxxxx> wrote:
> Hi Kinkie,
>
> On Tue, 23 Jun 2009 21:51:17 +0200, Kinkie wrote
>> Hi,
>> I can't see the advantage of using lighthttpd instead of squid+carp
>> as the frontend,
>
> The idea of putting a lighttpd server as a the frontend is for load
> balance.
>
> What exactly do you mean with squid+carp? several squid servers working as
> one?

Squid placed as the load balancer, using the CARP selection algorithm for
the balancing. These top-layer Squid generally don't cache to disk, but run
as memory-only, very high throughput services. CARP ensures that URLs are
sent consistently to the second layer of Squid for the most efficient
(non-duplicate) caching and failover.

The Wikimedia deployment does it this way for their front-end.
http://meta.wikimedia.org/wiki/File:Wikimedia-servers-2009-04-05.svg

> Can I have it working in an external DataCenter?

Most likely. As with any HTTP hierarchy the location of the individual hops
is not relevant to the traffic flow. But for the best performance the
underlying network topology and capacity should be considered.

What Squid+CARP offers that lighttpd does not, AFAIK, is the CARP algorithm
for 'sticky' URLs, so that objects are not duplicated across all the caches.
Some duplicate slippage may occur when peers die/return, but it's much less
than would normally occur.

> If so it seems to be a better solution, even because it's a fault tolerance
> solution.
>
>> and if using lighthttpd i can't see the advantage of
>> not serving static content directly out of the balancer.
>
> Actually, I'm just afraid of overload the server.
> Initially I don't know exactly how much resources would it consume from
> each
> server.
> If a server like that fits executing two roles, I'm sure it would be
> better.
>
>> Also watch out as nfs has locking and scaling issues of its own
>> (assuming thet nfs is what you mean by "single filesystem"), and it
>> also introduces a very nasty point-of-failure.
>
> Yes, it's a NAS.

What Kinkie means is that the efficiency is determined by the type of NAS.

Squid performs high-churn random-access IO when in full operation, and
needs to do so at the highest possible speed. The absolute _best_ solution
is to have a high-speed storage device spindle dedicated to just one
cache_dir in one Squid. None of the SAN/NAS I'm aware of can make that kind
of fine-grained assignment.

That said, modern hardware SAN/NAS solutions and even some newer software
ones are very efficient and can provide useful service levels. But the
traditional NFS and Samba file-system NAS can potentially introduce serious
speed issues when scaled. 1ms may not sound like much IO wait, but when it
affects 3K concurrent connections simultaneously (i.e. one loaded Squid) it
scales up towards a 3-second delay on every request.

>
> Kinkie, the architecture shouldn't be that suggested from me.
> It's just how I could figure out. Of course I want to make it better.
> Do you have a suggestion for that?
>
> For all I have understood your suggestion is:
>
> 1) Some squid servers + carp

Almost...
 1a) Squid load balancer using CARP to select layer-2.
 1b) Squid caching servers
then whatever backend you like...

>
> 2) Application server as the backend servers
>
> 3) A third server serving static resources

It's up to you whether the system is large enough to require separate (2)
and (3). For a small enough system (for some value of small) it's sufficient
to combine them and provide caching headers that push most of the static
work into the (1b) Squid.
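
Roughly, in squid.conf terms, the two layers could look something like the
untested sketch below. The hostnames, ports, paths and cache_dir sizes are
only placeholder values, not anything from your setup:

  # --- 1a) front-end balancer: no disk cache at this layer ---
  http_port 80 accel defaultsite=www.example.com vhost
  cache_peer cache1.example.com parent 80 0 carp no-query
  cache_peer cache2.example.com parent 80 0 carp no-query
  never_direct allow all      # always forward to one of the 1b peers
  cache deny all              # store nothing at this layer

  # --- 1b) caching layer (same config on each peer) ---
  http_port 80 accel defaultsite=www.example.com vhost
  cache_peer app1.example.com parent 8080 0 no-query originserver
  # one cache_dir per dedicated local disk, kept off the NFS/NAS
  cache_dir aufs /cache1 40000 16 256
  cache_dir aufs /cache2 40000 16 256

The 'carp' option on the cache_peer lines is what gives the 'sticky'
URL-to-peer hashing mentioned above; without it the parents are picked by
the normal peer selection and the same objects end up cached on several of
them. If you would rather have the 1a layer keep a small memory-only cache
instead of storing nothing, that comes down to cache_mem and the cache_dir
handling of your Squid version.
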
>
> I just didn't figure out your suggestion for storage.

Hopefully my comment above has clarified that a little.

Amos