The disk access method has a huge influence on throughput. Plenty of papers have been written over the last ten years on disk access patterns, and a number of them are specific to web caching. It's well known that using the unix filesystem in a one-file-per-object fashion is generally inefficient - there's more than one disk operation per create/write, open/read and unlink. If a web cache is populated with a 'normal' web cache distribution then you'll find the majority of cache objects (~95% in my live caches at work) are under 64k in size. Many (~50% I think, I'd have to go through my notes) are under 32k.

So it boils down to a few things:

* arranging the disk writes in a way that cuts back on the amount of seeking during writes
* arranging the disk writes in a way that cuts back on the amount of seeking during later reads
* handling replacement policies efficiently - e.g. you don't want high levels of fragmentation building up over time, as that can hurt your ability to batch disk writes
* disk throughput is a function of how you lay out the writes and how you queue the reads - and disks are smokingly fast when you're able to do big reads and writes with minimal seeking

Now, the Squid UFS method of laying things out is inefficient because:

* read/write/unlink operations involve more than one disk IO in some cases
* modern UNIX filesystems have a habit of synchronously journalling metadata, which also slows things down unless you're careful (BSD softupdates UFS doesn't, as a specific counter-example)
* there's no way to optimise disk read/write patterns by influencing the on-disk layout - for example, UNIX filesystems tend to 'group' files in a directory close together on disk (the same cylinder group, in the case of BSD FFS), but Squid doesn't put files from the same site - or even files fetched by the same client at a given time - in the same directory
* a site's objects are split up between disks, which can hurt the scheduling of reads for hits (if you look at a web page with 40 objects spread across 5 disks, that one client issues disk requests to all /five/ disks, rather than the more optimal approach of storing those objects sequentially on one disk and reading them all at once)
* it all boils down to too much disk seeking!

Now, just as a random data point: I'm able to pull 3 megabytes a second of random-read hits (~200 hits a second) from a single COSS disk. The disk isn't running anywhere near capacity even with this inefficient read pattern, and it's a SATA disk with no tagged queueing.

The main problem with COSS (besides the bugs :) is that the write rate is a function of both the data you're storing from server replies (cachable data) and the hits, which result in objects being relocated. A higher request rate means a higher write rate (storing fetched objects on disk), and a higher hit rate also means a higher write rate (storing both fetched and relocated objects on disk).

This one-disk system smokes a similar AUFS/DISKD setup on XFS and ext3. No, I don't have exact figures - I'm doing this for fun rather than as a graduate/honours project with a paper in mind - but even Duane's COSS polygraph results from a few years ago show COSS is quite noticeably faster than AUFS/DISKD. The papers I read from 1998-2002 were talking about pulling random reads off disk at a rate of ~500 objects a second (each under 64k in size). That's per disk. In 1998. :)
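To make the batching idea concrete, here's a rough sketch - not Squid's actual COSS code, just the shape of the idea, with made-up names like stripe_append() and STRIPE_SIZE - of packing small objects into one large in-memory stripe and pushing the whole thing to disk with a single sequential write:

#include <string.h>        /* memcpy */
#include <sys/types.h>     /* off_t */
#include <unistd.h>        /* pwrite */

#define STRIPE_SIZE (1 << 20)   /* a 1MB stripe holds dozens of <64k objects */

struct stripe {
    int    fd;              /* one big pre-allocated cache file or partition */
    off_t  disk_off;        /* where the current stripe will land on disk */
    size_t used;            /* bytes filled so far */
    char   buf[STRIPE_SIZE];
};

/* Append one object into the in-memory stripe; return its eventual
 * on-disk offset (to remember in the in-core index), or -1 if it
 * doesn't fit and the caller should flush first. */
static off_t
stripe_append(struct stripe *s, const void *obj, size_t len)
{
    off_t where;

    if (s->used + len > STRIPE_SIZE)
        return -1;
    memcpy(s->buf + s->used, obj, len);
    where = s->disk_off + (off_t) s->used;
    s->used += len;
    return where;
}

/* One large sequential write replaces hundreds of small scattered
 * create/write/close operations - this is where the seeks go away. */
static int
stripe_flush(struct stripe *s)
{
    if (pwrite(s->fd, s->buf, s->used, s->disk_off) != (ssize_t) s->used)
        return -1;
    s->disk_off += (off_t) s->used;   /* the log just moves forward */
    s->used = 0;
    return 0;
}

The point is that the per-object cost collapses to a memcpy(), and the disk only ever sees big sequential writes.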
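The hit path - again just a sketch under the same made-up structures, not the real code - shows where the extra write traffic comes from: a hit is served with one contiguous pread(), but the object is also copied back into the stripe currently being assembled (relocated), so every hit turns into a future write as well as a read:

#include <string.h>        /* memcpy */
#include <sys/types.h>     /* off_t, ssize_t */
#include <unistd.h>        /* pread */

struct obj_index {          /* in-core index entry for one cached object */
    off_t  offset;          /* where the object currently lives on disk */
    size_t length;
};

/* Serve a hit with one contiguous pread() - no open/read/close/unlink
 * dance - then relocate the object into the stripe being assembled
 * (the caller has already checked the stripe has room, as in the
 * previous sketch) so hot objects stay near the head of the log. */
static ssize_t
read_and_relocate(int fd, struct obj_index *e, void *buf,
                  char *stripe_buf, size_t *stripe_used, off_t stripe_disk_off)
{
    ssize_t n = pread(fd, buf, e->length, e->offset);

    if (n != (ssize_t) e->length)
        return -1;
    memcpy(stripe_buf + *stripe_used, buf, e->length);
    e->offset = stripe_disk_off + (off_t) *stripe_used;  /* its new home */
    *stripe_used += e->length;      /* written out on the next flush */
    return n;
}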
So, it's getting done in my spare time. And it'll turn a Squid server into something comparable to the commercial caches from 2001 :) (i.e. ~2400 req/sec on the polygraph workloads, with whatever hit rate was offered being closely matched.) I can only imagine what they're able to achieve today with such tightly-optimised codebases.

Adrian

On Tue, Jul 11, 2006, H wrote:
> Hi
>
> I am not so sure the particular data access method is what makes the
> difference. Most real cases are bound by disk or other hardware limitations.
> Even though they are often discussed, IDE/ATA disks do not come close to SCSI
> disk throughput in multi-user environments. Standard PCs often have exactly
> the 2-5MB/s limit Rick mentions, and you can do what you want - there is
> nothing more to get. I believe that squid, when it reaches that limit, simply
> stops caching and goes direct, which means the cache server ends up running
> uselessly at the edge, not caching.
>
> With good hardware, not necessarily server motherboards, you can get the
> 30MB/s you mention, but I am not sure how much of that 30MB/s is cache data -
> do you get 5% or less from disk?
>
> We have some high-bandwidth networks where we use squid on the main server as
> a non-caching server, with several parents where the cache-to-disk work is
> done. The main server seems to be bound only by the OS packets-per-second
> limit (no disk access) and we get up to 90MB/s through it. The parent caches
> are queried by content type or object size. Of course the connection between
> these servers is gigabit full duplex. This way we get up to 20% less bandwidth
> utilization. A while ago we got up to 40%, but since emule and other p2p
> applications became popular, things are not so good anymore.
>
> What we use are FreeBSD 6.1-STABLE servers with squid14 as a transparent
> proxy, on AMD64 dual-Opterons for the main servers and AMD64 X2 machines for
> the parent caches, all with SCSI-320 and lots of good memory - 16GB and up on
> the main server and 4GB on the parents. The best experience and performance
> on standard hardware I got with an Epox motherboard and an AMD X2 4400 or
> 4800. I run more than one squid process on each SMP server.
>
> Hans