I think that if you can use a good disk controller (with 1 GB or more of
cache) and set up:

  1 RAID10 array of 4 disks for the OS
  2 RAID10 arrays of 4 disks each, one per Squid disk cache store
  (or even 2 RAID5 arrays of 3 disks each)

you could get good I/O speed.

--
Regards,

Ricardo Felipe Klein
klein.rfk@xxxxxxxxx


On Fri, Jun 14, 2013 at 7:44 PM, Ricardo Klein <klein.rfk@xxxxxxxxx> wrote:
>
> On Fri, Jun 14, 2013 at 7:29 PM, Stephan Viljoen <steph@xxxxxxxxxxxx> wrote:
>>
>> I was thinking of buying a Supermicro server with 8 to 16 drive bays and
>> filling it with 15K RPM SAS disks, as I need more disk I/O. I guess a pure
>> memory system will still be the fastest option, but I'm looking for
>> something in between: speeding up browsing and saving as much bandwidth
>> as possible without sacrificing too much speed.
>>
>>
>> -----Original Message-----
>> From: Marcus Kool [mailto:Marcus.Kool@xxxxxxxxxxxxxxx]
>> Sent: Friday, June 14, 2013 5:35 PM
>> To: csn233
>> Cc: Stephan Viljoen; squid-users@xxxxxxxxxxxxxxx; support and sales desk
>> URLfilterDB
>> Subject: Re: Squid Hardware requirements.
>>
>> On Fri, Jun 14, 2013 at 09:53:20PM +0800, csn233 wrote:
>> > With YMMV in mind, I get different mileage:
>> >
>> > On Fri, Jun 14, 2013 at 7:41 PM, Marcus Kool
>> > <marcus.kool@xxxxxxxxxxxxxxx> wrote:
>> > > and if your network pipe has sufficient capacity, also fetching an
>> > > object again from the internet can be faster than fetching from
>> > > disk.
>> >
>> > Your network may be fast, but it doesn't imply a fast path between you
>> > and the origin server. In other words, it depends on other factors
>> > than just your own network pipe.
>>
>> Yes, mileage may vary and depends on many factors.
>> Overall, Squid servers without a disk cache can be faster than those with
>> a disk cache, so it is worth looking at.
>>
>> > > - more expensive (disks + battery-backed I/O controller)
>> >
>> > Expensive disks/battery-backed are over-kill. More/adequate spindles
>> > should do the job just as well. Why do you need a battery-backed
>> > controller? Squid is not a transaction-based system - if you lose the
>> > cache, tough, do "squid -z" and start again.
>>
>> Fast disks are good. Multiple controllers and multiple buses are good.
>> An EMC disk array is the most expensive and best option since Squid
>> desires a huge number of IOPS.
>> Battery-backed disk controllers are a good tradeoff: they are not so
>> expensive and give a reasonable performance boost.
>>
>> > > - Squid uses more memory to index the disk cache (14 MB memory per
>> > > GB disk cache)
>> >
>> > My memory allocation is only about 20-30% of that (formula), and
>> > paging/swapping metrics don't indicate there is a problem. General
>> > formulas may not always apply.
>>
>> The 14 MB per GB is documented in the Squid wiki and based on the
>> observation that the average object size is 13 KB.
>> If you only have 20-30% of the formula you may have a larger average
>> object size or only use 20-30% of the configured disk cache.
>>
>> > > unless a redundant hot-swap RAID array is used, less downtime.
>> >
>> > Older versions have a problem if a cache_dir fails, I think. Has this
>> > changed with later versions, or is it in the pipeline to change, anyone?
>>
>> The thread started with a web proxy for an ISP.
>> ISPs generally do not want to restart the proxy and/or rebuild the index.
>> It takes too long.
>>
>> > > One can also redistribute budget:
>> > > - use the budget of the disk system to max out memory.
>> >
>> > The benefits of memory will plateau pretty quickly.
>> > Unless one regularly has a whole bunch of users wanting to access the
>> > same pages within a relatively short time, the benefit from more memory
>> > has its limits. Max-out could easily become wastage.
>>
>> No, memory is by far the fastest cache medium. Since memory is relatively
>> cheap it is the best option.
>>
>> > > - put as much memory as possible.
>> >
>> > Disagree - see above. It depends.
>>
>> OK, I stated it a bit aggressively. It should read "Buy as much memory as
>> your budget allows".
>>
>> > > - carefully size the disk cache; not too large since Squid keeps the
>> > > index
>> >
>> > Agree. If your hit-ratios don't increase, there's not much point in
>> > having larger cache_dir's. But I wouldn't go as far as "carefully".
>> > You just need enough or more, just not too much more.
>>
>> That is your point of view. I prefer to be careful not to use more than
>> enough since it wastes memory.
>>
>> > > - if using a disk cache, use fast disks and a very good caching I/O
>> > > controller to get maximum disk performance
>> >
>> > Up to a point only, as mentioned above. Local disk I/O may be fast,
>> > but it doesn't mean your internet access will be as well. Which means
>> > you end up spending money on hardware that does not deliver actual
>> > results.
>>
>> Squid is hungry for a large number of IOPS. So get the best that your
>> budget can buy.
>> For low budgets this is a relatively cheap caching disk controller; for
>> high budgets it varies between low-end and high-end disk arrays (the ones
>> that have between 32 and 1000+ spindles).
>>
>> > As Amos said, get the fastest per-core GHz you can find, number of
>> > cores not important. And have enough disk spindles.
>>
>>
>
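
For anyone sizing a box along these lines, the rule of thumb quoted above (14 MB of index memory per GB of disk cache, based on a 13 KB average object size) can be turned into a rough calculator. This is only a sketch: the function name and its parameters are made up for illustration, and it assumes index memory scales linearly with the number of cached objects. That matches the explanation quoted above (a larger average object size, or a cache that is only partly filled, lowers the figure), but it is not an exact model of Squid's memory use.

```python
# Rule-of-thumb figures quoted in the thread.
MB_PER_GB = 14.0            # index memory (MB) per GB of disk cache
BASELINE_OBJECT_KB = 13.0   # average object size behind that figure


def estimate_index_mb(cache_dir_gb, avg_object_kb=BASELINE_OBJECT_KB,
                      fill_fraction=1.0):
    """Rough estimate of the memory (MB) needed to index a disk cache.

    Assumption (hypothetical model): index memory is proportional to the
    number of cached objects, so it shrinks when objects are larger on
    average or when only part of the configured cache is in use.
    """
    if cache_dir_gb < 0 or avg_object_kb <= 0:
        raise ValueError("sizes must be positive")
    if not 0.0 <= fill_fraction <= 1.0:
        raise ValueError("fill_fraction must be between 0 and 1")
    size_ratio = BASELINE_OBJECT_KB / avg_object_kb
    return cache_dir_gb * fill_fraction * MB_PER_GB * size_ratio


if __name__ == "__main__":
    # A 500 GB disk cache under different conditions.
    print(estimate_index_mb(500))                      # full, 13 KB objects
    print(estimate_index_mb(500, avg_object_kb=52))    # 4x larger objects
    print(estimate_index_mb(500, fill_fraction=0.25))  # cache 25% filled
```

With the baseline figures, a full 500 GB cache comes out at about 7 GB of index memory, while either 4x larger objects or a cache that is only a quarter full brings it down to roughly a quarter of that. That would be in line with the "20-30% of the formula" observation earlier in the thread.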