
Re: High load server Disk problem

Robert Pipca wrote:
Hi,

2010/8/18 Jose Ildefonso Camargo Tolosa <ildefonso.camargo@xxxxxxxxx>:
Yeah, I missed that last night (I was sleepy, I guess), thank God you
people are around! Still, he would need faster disk access, unless
he is talking about 110Mbps (~12MB/s) instead of 110MB/s (~1Gbps).

So, Robert, is that 110Mbps or 1Gbps?

We have 110Mbps of HTTP network traffic (the actual bandwidth is around
250Mbps, but I'm talking about HTTP only).

But aufs doesn't seem to behave that well. I have it on XFS mounted
with noatime. I ran "squid -z" on the aufs cache_dir to see if aufs
behaves better with fewer objects. It does, but I still get quite a lot
of these:

2010/08/18 21:22:20| squidaio_queue_request: WARNING - Disk I/O overloading
2010/08/18 21:22:35| squidaio_queue_request: WARNING - Disk I/O overloading
2010/08/18 21:22:35| squidaio_queue_request: Queue Length:
current=533, high=777, low=321, duration=20
2010/08/18 21:22:50| squidaio_queue_request: WARNING - Disk I/O overloading
2010/08/18 21:22:50| squidaio_queue_request: Queue Length:
current=669, high=777, low=321, duration=35
2010/08/18 21:23:05| squidaio_queue_request: WARNING - Disk I/O overloading
2010/08/18 21:23:05| squidaio_queue_request: Queue Length:
current=422, high=777, low=321, duration=50
2010/08/18 21:23:22| squidaio_queue_request: WARNING - Disk I/O overloading
2010/08/18 21:41:46| squidaio_queue_request: WARNING - Queue congestion

So duration keeps growing... meaning the problem will occur again.

Now, it seems that COSS behaves very nicely.

I'd like to know if I can adjust the max-size option of coss with
something like "--with-coss-membuf-size"? Or is it really hard-coded?

It can be altered but not to anything big...
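
For reference, roughly how that compile-time option gets passed (the flag
name is the one you mention; the 2MB value and the --enable-storeio list
are only illustrative, so treat this as an untested sketch):

  ./configure \
      --enable-storeio=coss,aufs \
      --with-coss-membuf-size=2097152    # membuf/slice size in bytes (2MB here)

As I understand COSS, an object has to fit inside a single membuf/slice,
which is why max-size can only be pushed so far.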


I use the aufs cache_dir for youtube and windowsupdate caches. So if
I could increase max-size of the coss cache_dirs to around 100MB, I
could leave the aufs cache_dir for windowsupdate files only (which are
around 300MB+). Is it possible?

No. The "buf"/slices are equivalent to swap pages for COSS. Each is swapped in/out of disk as a single slice of the total cache. Objects are arranged on them with temporal locality so that ideally requests from one website or webpage all end up together on a single slice. Theory goes being that clients only have to wait for the relevant COSS slice for their requested webpage to be swapped into RAM and all their small followup requests for .js .css and images etc get served directly from there.

Your COSS dirs are already sized at nearly 64GB each (65520 MB), with objects up to 1MB stored there. That covers most Windows updates, which are usually only a few hundred KB each. I'm not sure what your slice size is, but 15 of them are stored in RAM at any given time. You may want to increase that membufs= parameter a bit, or reduce the individual COSS dir size (which requires a COSS dir erase and rebuild).
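
Not having your exact cache_dir lines in front of me, a rough sketch of the
sort of adjustment I mean (untested; the path and numbers are only
illustrative, keep whatever block-size you already use to reach 65520 MB):

  # illustrative only - raise membufs a little, or shrink the dir
  cache_dir coss /cache1/coss 65520 max-size=1048576 membufs=30

Each extra membuf costs roughly one slice worth of RAM, so raise it
gradually and watch memory usage.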


The rule-of-thumb for effective swap management seems to be storing 5min of data throughput in memory to avoid overly long disk I/O wait times. Assuming an average hit-rate of around 20%, that comes to needing roughly 1min of full HTTP bandwidth in memory (combined: cache_mem RAM cache + COSS membufs) at any given time.
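
Back-of-the-envelope for your numbers (rough arithmetic only):

  110 Mbps            ~= 13.75 MB/s of HTTP traffic
  13.75 MB/s * 60 s   ~= 825 MB of combined cache_mem + COSS membufs

So something in the high hundreds of MB of RAM dedicated to caching, split
between cache_mem and the membufs, would be the ballpark to test around.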

Disclaimer: that's just my second-rate interpretation of a recent thesis presentation on memory vs flash vs disk service. So testing is recommended.


It may also be time for you to perform an HTTP object analysis.

This involves grabbing a period of the logs and counting how many objects go through your proxy, grouped into regular size brackets (0-512 bytes, 512B-1KB, 1-2KB, 2-4KB, 4-8KB, 8-16KB, 16-32KB, ...).
[there are likely tools out there that do this for you.]
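
Failing that, a quick hand-rolled sketch along these lines does the job
(untested; it assumes the default native access.log format, where the
reply size is the 5th field, and the power-of-two brackets are just
illustrative):

  #!/usr/bin/env python
  # Count objects from a Squid access.log into power-of-two size brackets.
  # Assumes the native log format: field 5 is the reply size in bytes.
  import sys
  from collections import defaultdict

  counts = defaultdict(int)
  for line in sys.stdin:
      fields = line.split()
      try:
          size = int(fields[4])   # reply size in bytes
      except (IndexError, ValueError):
          continue                # skip malformed lines
      bracket = 512
      while size > bracket:       # find the smallest power-of-two bracket
          bracket *= 2
      counts[bracket] += 1

  for bracket in sorted(counts):
      print("<= %9d bytes: %d objects" % (bracket, counts[bracket]))

Run it as:  python size-histogram.py < access.log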

There are three peaks that appear in these counts: one usually near zero for the IMS requests; one in the low 4KB-128KB range for the general image/page/script content; and one in the 1MB-50MB range for video media objects. Between these last two peaks there is a dip. IMO the min-size/max-size boundary between COSS and AUFS should sit somewhere around the low point of that dip.

The objects above the dip are popular, but too large for COSS to swap in/out efficiently; AUFS handles these very nicely. The objects below the dip are the reverse: too small to be worth waiting for an individual AUFS swap-in/out, and likely to be clustered in the highly inter-related webpage bunches that COSS handles very well.
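
Once you know where your dip actually is, the split looks something like
this (untested sketch; the 100KB boundary, paths and sizes are only
placeholders for whatever your own analysis shows, and min-size= needs a
Squid build recent enough to support it on cache_dir):

  # boundary value is illustrative - use the low point of your own dip
  cache_dir coss /cache1/coss 65520 max-size=102400 membufs=30
  cache_dir aufs /cache2/aufs 300000 64 256 min-size=102400

That way the small, inter-related objects stay on COSS and the big
windowsupdate/youtube objects land on AUFS.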


Amos
--
Please be using
  Current Stable Squid 2.7.STABLE9 or 3.1.6
  Beta testers wanted for 3.2.0.1

