Re: RAID is good

Matus UHLAR - fantomas wrote:
On 25.03.08 10:23, Marcus Kool wrote:
I wish that the wiki page on RAID were rewritten.

I think that (nearly) anyone can rewrite it, but...
Companies depend on internet access and a working Squid proxy,
and therefore the advocated "no problem if a single disk fails"
does not reflect today's reality.

That's a different problem, similar to one that was raised already: squid
should be able to work if one of the configured cache_dirs is unavailable.
The ability to remove a cache_dir if it fails is just an enhancement of this
functionality.

If there is no bug report for this, it's time to create one...

Run-time recovery from HDD errors is on the RoadMap wishlist, awaiting someone's interest.

Which, as you say, is a separate problem from the non-existent one the wiki refers to: recovery of data which is simply a redundant mirror of data easily accessible elsewhere.


One should also consider the difference between
simple RAID and extremely advanced RAID disk systems
(i.e. EMC and other arrays).
The external disk arrays like EMC with internal RAID5 are simply faster
than a JBOD of internal disks.

How many write cycles does EMC use to back up data after one system-issued write cycle? How many CPU cycles does EMC spend figuring out which disk the file slice is located on, _after_ squid has already hashed the file location to figure out which disk the file is located on?

Regardless of speed, unless you can provide a RAID system which performs less than one hardware disk-io read/write per system disk-io read/write, you hit these theoretical limits.
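
To put rough numbers on that limit, here is a minimal sketch of how many hardware disk operations one system-level small write turns into under a few layouts. The per-layout figures are the usual textbook values (two-way mirror, RAID5 read-modify-write), not measurements of any particular array, and the function and table names are made up for illustration. It ignores controller caches and write coalescing.

```python
# Hardware disk I/Os issued for ONE system-level small write.
# Textbook values, assumed for illustration; not measured on real hardware.
PHYSICAL_IOS_PER_LOGICAL_WRITE = {
    "jbod": 1,   # one write lands on one disk
    "raid0": 1,  # striping only, no redundancy
    "raid1": 2,  # two-way mirror: the same block is written twice
    "raid5": 4,  # read old data + old parity, write new data + new parity
}


def physical_ios(layout: str, logical_writes: int) -> int:
    """Return the hardware I/O count for a number of system-level writes."""
    return PHYSICAL_IOS_PER_LOGICAL_WRITE[layout] * logical_writes


if __name__ == "__main__":
    for layout in PHYSICAL_IOS_PER_LOGICAL_WRITE:
        total = physical_ios(layout, 1000)
        print(f"{layout}: {total} hardware I/Os per 1000 system writes")
```

None of the redundant layouts gets below one hardware operation per system operation, which is the point being made above.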


In such a case, RAID1 of such disks would be even faster if you need
reliability (for now), or RAID0 or maybe JBOD using the EMC...


True, RAID is becoming faster overall, or at least the servers it runs on are.

But it's not so much a problem of human-noticeable absolute time; the underlying problem of duplicated disk-io cycles, processor-io cycles and processor delays remains.

For now the CPU half of the problem gets masked by the single-threadedness of squid (never thought you'd see that being a major benefit, eh?). If squid begins using all the CPU threads, the OS will lose out on its spare CPU cycles on dual-core machines, and RAID may become a noticeable problem there.

Halving the lifetime of HDDs for no benefit is not a good idea, even in wealthy large setups. And the guys running squid in high-performance situations would agree that any speed reduction is not good.

For background:

Before I wrote that wiki page I had tested Squid on a 2.6GHz single-CPU box with RAID-mirrored drives. It ran noticeably slower (and louder) than an equivalent 1.2GHz box without the RAID.

That was followed last year by numerous performance help requests here on squid-users from people trying squid with RAID and finding that removing it gave a large immediate performance boost.

What I have laid out in the text is the theory behind squid+RAID. If you are going to obsolete any of the information there, please provide hardware specs and run the math before doing so. You might be unpleasantly surprised.
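
For anyone wanting to run the math, a back-of-envelope sketch follows. It assumes the common simplified model where reads cost one hardware I/O and writes cost a fixed per-layout penalty. The disk count, per-disk IOPS and write fraction are placeholder figures to be replaced with real hardware specs, and the function name is hypothetical. Controller caches, which matter for the large external arrays, are not modelled.

```python
def logical_iops(disks: int, iops_per_disk: float,
                 write_fraction: float, write_penalty: int) -> float:
    """Estimate system-level IOPS a set of disks can sustain.

    write_penalty is the number of hardware I/Os per system write
    (assumed: 1 for JBOD/RAID0, 2 for RAID1, 4 for RAID5 small writes).
    """
    raw_iops = disks * iops_per_disk
    # Average hardware I/Os consumed by one system-level I/O.
    cost_per_io = (1.0 - write_fraction) + write_fraction * write_penalty
    return raw_iops / cost_per_io


if __name__ == "__main__":
    # Placeholder specs: 4 disks, 150 IOPS each, half of all I/O is writes.
    for name, penalty in (("JBOD", 1), ("RAID1", 2), ("RAID5", 4)):
        estimate = logical_iops(4, 150.0, 0.5, penalty)
        print(f"{name}: about {estimate:.0f} system IOPS")
```

With those placeholder inputs the same four disks come out at roughly 600, 400 and 240 system IOPS respectively. Real results depend on the actual read/write mix of the cache and on the hardware.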

Amos
--
Please use Squid 2.6STABLE17+ or 3.0STABLE1+
There are serious security advisories out on all earlier releases.
