Larry & All,

I would much rather rebuild a bad drive with a RAID controller than have
to wait for Gluster to do it. With a large number of files, an ls -aglR
can take weeks. Also, you don't NEED enterprise drives with a RAID
controller; I use desktop 1.5TB Seagate drives, which are happy as a clam
on a 3ware SAS card under a SAS expander.

liam

On Thu, Dec 17, 2009 at 8:17 AM, Larry Bates <larry.bates at vitalesafe.com> wrote:
> Phil,
>
> I think the real question you need to ask has to do with why we are
> using GlusterFS at all and what happens when something fails. Normally
> GlusterFS is used to provide scalability, redundancy/recovery, and
> performance. For many applications performance will be the least of the
> worries, so we concentrate on scalability and redundancy/recovery.
> Scalability can be achieved no matter which way you configure your
> servers. Using the distribute translator (DHT) you can unify all the
> servers into a single virtual storage space. The problem comes when you
> look at what happens when you have a machine/drive failure and need the
> redundancy/recovery capabilities of GlusterFS. By putting 36TB of
> storage on a single server and exposing it as a single volume (using
> either hardware or software RAID), you will have to replicate all of it
> to a replacement server after a failure. Replicating 36TB will take a
> lot of time and CPU cycles. If you keep things simple (JBOD), use AFR
> to replicate drives between servers, and use DHT to unify everything
> together, you only have to move 1.5TB/2TB when a drive fails. You will
> also note that you get to use 100% of your disk storage this way,
> instead of losing one drive per array with RAID5 or two drives with
> RAID6. With RAID5/6 it is also normally imperative that you have a hot
> spare per array, which costs you an additional drive per array. To make
> RAID5/6 work with no single point of failure you have to do something
> like RAID50/60 across two controllers, which gets expensive and much
> more difficult to manage and to grow. Implementing GlusterFS on more
> modest hardware makes all those "issues" go away. Just use GlusterFS to
> provide the RAID-like capabilities (via AFR and DHT).
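>
> For illustration, a rough client-side volfile in the GlusterFS 3.0
> style showing two disks mirrored across two servers and then unified.
> The hostnames and export names are placeholders, and the matching
> protocol/server volfiles on each server are omitted:
>
>   # one protocol/client volume per exported disk
>   volume server1-disk1
>     type protocol/client
>     option transport-type tcp
>     option remote-host server1
>     option remote-subvolume disk1
>   end-volume
>
>   volume server2-disk1
>     type protocol/client
>     option transport-type tcp
>     option remote-host server2
>     option remote-subvolume disk1
>   end-volume
>
>   volume server1-disk2
>     type protocol/client
>     option transport-type tcp
>     option remote-host server1
>     option remote-subvolume disk2
>   end-volume
>
>   volume server2-disk2
>     type protocol/client
>     option transport-type tcp
>     option remote-host server2
>     option remote-subvolume disk2
>   end-volume
>
>   # AFR: mirror each disk on server1 to its twin on server2
>   volume mirror1
>     type cluster/replicate
>     subvolumes server1-disk1 server2-disk1
>   end-volume
>
>   volume mirror2
>     type cluster/replicate
>     subvolumes server1-disk2 server2-disk2
>   end-volume
>
>   # DHT: hash files across all the mirrored pairs (repeat the
>   # pattern above and add each mirrorN to this subvolumes line)
>   volume unified
>     type cluster/distribute
>     subvolumes mirror1 mirror2
>   end-volume
>
> With a layout like this, replacing a failed 1.5TB/2TB disk is just a
> matter of mounting the new disk and walking the tree (ls -R or find)
> so that AFR re-creates the missing copies; only that one mirrored
> pair does any work.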
>
> Personally, I doubt I would set up my storage the way you describe. I
> probably would (and have) set it up with more, smaller servers:
> something like three times as many 2U servers with 8 x 2TB drives each
> (or even six times as many 1U servers with 4 x 2TB drives each), and
> forget the expensive RAID SATA controllers; they aren't necessary and
> are just a single point of failure that you can eliminate. In addition
> you will enjoy significant performance improvements because you have:
>
> 1) Many parallel paths to storage (36 x 1U or 18 x 2U vs. 6 x 5U
> servers). Gigabit Ethernet is fast, but it will still limit bandwidth
> to a single machine.
> 2) Write performance on RAID5/6 is never going to be as fast as JBOD.
> 3) You will have much more memory available for caching (36 x 8GB =
> 288GB or 18 x 8GB = 144GB vs. maybe 6 x 16GB = 96GB).
> 4) Management of the storage is done in one place: GlusterFS. No messy
> RAID controller setups to document/remember.
> 5) You can expand in the future in a much more granular and controlled
> fashion. Add 2 machines (1 for replication) and you get 8TB (using 2TB
> drives) of storage. When you want to replace a machine, just set up the
> new one, fail the old one, and let GlusterFS rebuild the new one for
> you (AFR will do the heavy lifting). CPUs will get faster, and hard
> drives will get faster and bigger in the future, so make it easy to
> upgrade. A small number of BIG machines makes it a lot harder to do
> upgrades as new hardware becomes available.
> 6) Machine failures (motherboard, power supply, etc.) will affect much
> less of your storage network. Having a spare 1U machine around as a hot
> spare doesn't cost much (maybe $1,200); having a spare 5U monster
> around does (probably close to $6,000).
>
> IMHO, 36 x 1U or 18 x 2U servers shouldn't cost any more (and maybe
> less) than the big boxes you are looking to buy. They are commodity
> items. If you go the 1U route you don't need anything but a machine
> with memory and 4 hard drives (all server motherboards come with at
> least 4 SATA ports). By using 2TB drives, I think you would find that
> the cost would actually be less. By NOT using hardware RAID you can
> also skip RAID-class hard drives, which cost about $100 more each than
> non-RAID hard drives. Just that change alone could save you 6 x 24 =
> 144 drives x $100 = $14,400! JBOD just doesn't need RAID-class hard
> drives, because you don't need the sophisticated firmware that
> RAID-class drives provide. You will still want quality hard drives, but
> failures will have such a low impact that they are much less of a
> problem.
>
> By using more, smaller machines you also eliminate the need for
> redundant power supplies (which would be a requirement in your large
> boxes, because a power supply there would be a single point of failure
> for a large percentage of your storage system).
>
> Hope the information helps.
>
> Regards,
> Larry Bates
>
>
> ------------------------------
>>
>> Message: 6
>> Date: Thu, 17 Dec 2009 00:18:54 -0600
>> From: phil cryer <phil at cryer.us>
>> Subject: Recommended GlusterFS configuration for 6 node cluster
>> To: "gluster-users at gluster.org" <gluster-users at gluster.org>
>> Message-ID:
>>      <3a3bc55a0912162218i4e3f326cr9956dd37132bfc19 at mail.gmail.com>
>> Content-Type: text/plain; charset=UTF-8
>>
>> We're setting up 6 servers, each with 24 x 1.5TB drives; the systems
>> will run Debian testing and Gluster 3.x. The SATA RAID card offers
>> RAID5 and RAID6, and we're wondering what the optimum setup would be
>> for this configuration. Do we RAID5 the disks and have GlusterFS use
>> them that way, or do we keep them all 'raw' and have GlusterFS handle
>> the replication (though not 2x as we would have with the RAID
>> options)? Obviously there are a lot of ways to do this; we're just
>> wondering what GlusterFS devs and other experienced users would
>> recommend.
>>
>> Thanks
>>
>> P
>>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>