Hi Jeremy.

Indeed, there is no gain to expect when you try to accelerate purely
large static content via a proxy or accelerator. Even Apache is usually
"good" enough to be slowed down by the hard disk - unless, of course, you
have to serve a lot of parallel requests at once, or you produce
basically static pages with dynamic script calls and/or use complex URL
rewrite rules and the like in Apache.

Large static files are always served from disk; it doesn't matter
whether it's a cache entry or a normal file. A lot of small files would
benefit from a large RAM cache, but there it also doesn't matter whether
the cache lives in the web server itself or in the proxy or accelerator.

Honestly, for your kind of problem you should look into replacing the
Apache farm with a farm dedicated to serving static content. That's how
most heavy-traffic domains handle things too. There are several free
servers out there that would improve your situation: thttpd, lighttpd,
boa, Tux, or even my hssTVS. On the commercial side of things you can
find names like Zeus, LiteSpeed, premium thttpd, lighttpd professional,
and so on.

-----Original Message-----
From: Jeremy Utley [mailto:jerutley@xxxxxxxxx]
Sent: Wednesday, 8 February 2006 01:30
To: Squid ML
Subject: Re: Performance problems - need some advice

On 2/7/06, Kinkie <kinkie-squid@xxxxxxxxx> wrote:
> On Tue, 2006-02-07 at 12:49 -0800, Jeremy Utley wrote:
> > On 2/7/06, Kinkie <kinkie-squid@xxxxxxxxx> wrote:
> > >
> > > Profiling your server would be the first step.
> > > How does it spend its CPU time? Within the kernel? Within the squid
> > > process? In iowait? What's the number of open file descriptors in
> > > Squid (you can gather that from the cachemgr)? And what about disk
> > > load? How much RAM does the server have, how much of it is used by
> > > squid?
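The profiling questions quoted above can be answered with standard
tools. A minimal sketch, assuming a Linux box with sysstat installed and
Squid's cachemgr reachable locally (the commented commands are
illustrative and need adjusting to the actual setup):

```shell
#!/bin/sh
# Sketch: gathering the numbers Kinkie asks about above (Linux).

# CPU split (user/system/iowait) and per-disk utilization, 5 s samples:
# iostat -x 5

# File descriptors currently in use by Squid, from the cachemgr:
# squidclient mgr:info | grep -i 'file desc'

# The per-process descriptor ceiling a daemon started from this shell
# inherits; Squid needs this raised well above the usual 1024 default:
ulimit -n
```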
> >
> > I was monitoring the servers as we brought them online last night in
> > most respects - I wasn't monitoring file descriptor usage, but I do
> > have squid patched to support more than the standard number of file
> > descriptors, and I am using the ulimit command according to the FAQ.
>
> That can be a bottleneck if you're building up a SYN backlog. Possible,
> but relatively unlikely.
>
> > When I was monitoring, squid was still building its cache, and squid
> > was using most of the system memory at that time. It seems our major
> > bottleneck is in disk I/O - if squid can fulfill a request out of
> > memory, everything is fine, but if it has to go to the disk cache,
> > performance suffers.
>
> That can be expected to a degree. So are you seeing lots of iowait in
> the system stats?

During our last test run, the machines were spending around 30-50% of
their time in iowait, according to iostat.

> > Right now, we have 5 18GB SCSI disks holding our
> > cache; 2 of those are on the primary SCSI controller with the OS disk,
> > the other 3 on the secondary.
>
> How are the cache disks arranged? RAID? No RAID (aka JBOD)?

Right now, no RAID is involved at all. Each cache disk has a single
partition on it, occupying the entire disk, and each partition is
mounted on a separate directory:

/dev/sdb1 -> /cache1
/dev/sdc1 -> /cache2
/dev/sdd1 -> /cache3
/dev/sde1 -> /cache4
/dev/sdf1 -> /cache5

Each one has its own cache_dir line in the squid.conf file.

> > Could there perhaps be better
> > performance with one larger disk on one controller with the OS disk,
> > and another larger disk on the secondary controller?
>
> No, in general more spindles are good because they can perform in
> parallel. What kind of cache_dir system are you using? aufs? diskd?

Our initial testing last night used the normal ufs - we just switched
over to aufs (posts I found on the squid ML said this would be better
for Linux systems), and it made a very noticeable improvement in
performance.
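For reference, the five-spindle layout described above would translate
into one aufs cache_dir line per mount point, along these lines (a
sketch - the 15000 MB size and the 16/256 L1/L2 directory counts are
illustrative values for an 18 GB disk, not a recommendation):

```
# squid.conf sketch -- one aufs cache_dir per spindle/mount point.
# Format: cache_dir <type> <directory> <Mbytes> <L1> <L2>
# 15000 MB leaves headroom on an 18 GB disk; 16/256 are common
# first/second-level directory counts.
cache_dir aufs /cache1 15000 16 256
cache_dir aufs /cache2 15000 16 256
cache_dir aufs /cache3 15000 16 256
cache_dir aufs /cache4 15000 16 256
cache_dir aufs /cache5 15000 16 256
```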
> > We're also
> > probably a little low on RAM in the machines - each of the 2 current
> > squid servers has 2GB of RAM installed.
>
> I assume that you're serving much more content than that, right?

Of course. The total content being served by this cluster is close to
200GB right now, and it is expected to grow further.

> > Right now, we have 4 Apache servers in a cluster, and these machines
> > currently max out at about 300Mb/s. Our hope is to utilize squid to
> > push this up to about 500Mb/s, if possible. Has anyone out there ever
> > gotten a squid server to push that kind of traffic? Again, the files
> > served from these servers range from a few hundred KB to around 4MB
> > in size.
>
> In raw terms, Apache should outperform Squid due to more specific OS
> support. Squid outperforms Apache in flexibility, manageability, and by
> offering more control over the server and what the clients can and
> cannot do.

This seems surprising to me, honestly. If Squid, using its caching
ability, can't push out data faster than Apache, then it seems there
wouldn't be any reason to use it as an HTTP accelerator like this.
Maybe there's something I'm missing in your statement.

> Please keep the discussion on the mailing-list. It helps get more ideas
> and can also provide valuable feedback for others who might be
> interested in the same topics.

I never intended to take the discussion off-list, but when I hit reply,
it went to you instead of the list :(

Jeremy
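On the RAM point above: with only 2GB per box, Squid's memory cache has
to stay small, because the in-core index for the disk cache alone needs
roughly 10MB of RAM per GB of cache_dir space - about 900MB for 5 x
18GB. A hedged squid.conf sketch (both values are illustrative, not
tuned recommendations):

```
# squid.conf sketch -- memory tuning on a 2GB box, values illustrative.
# The cache_dir index (~10MB RAM per GB of disk cache) already claims
# most of the RAM, so keep the hot-object memory cache modest.
cache_mem 256 MB
# Keep only small, hot objects in the memory cache:
maximum_object_size_in_memory 64 KB
```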