This is *very* helpful, thanks for taking the time Larry! Looking forward
to giving feedback once we have the cluster up.

P

On Thu, Dec 17, 2009 at 11:23 AM, Tejas N. Bhise <tejas at gluster.com> wrote:
> Thanks, Larry, for the comprehensive information.
>
> Phil, I hope that answers a lot of your questions. Feel free to ask more,
> we have a great community here.
>
> Regards,
> Tejas.
>
> ----- Original Message -----
> From: "Larry Bates" <larry.bates at vitalesafe.com>
> To: gluster-users at gluster.org, phil at cryer.us
> Sent: Thursday, December 17, 2009 9:47:30 PM GMT +05:30 Chennai, Kolkata, Mumbai, New Delhi
> Subject: Re: Gluster-users Digest, Vol 20, Issue 22
>
> Phil,
>
> I think the real question you need to ask has to do with why we are
> using GlusterFS at all and what happens when something fails. Normally
> GlusterFS is used to provide scalability, redundancy/recovery, and
> performance. For many applications performance will be the least of the
> worries, so we concentrate on scalability and redundancy/recovery.
> Scalability can be achieved no matter which way you configure your
> servers: using the distribute translator (DHT) you can unify all the
> servers into a single virtual storage space. The problem comes when you
> look at what happens when you have a machine or drive failure and need the
> redundancy/recovery capabilities of GlusterFS. By putting 36Tb of
> storage on a single server and exposing it as a single volume (using
> either hardware or software RAID), you will have to replicate all of it to a
> replacement server after a failure. Replicating 36Tb will take a lot of
> time and CPU cycles. If you keep things simple (JBOD), use AFR to
> replicate drives between servers, and use DHT to unify everything
> together, you only have to move 1.5Tb/2Tb when a drive fails. You
> will also note that you get to use 100% of your disk storage this way,
> instead of losing one drive per array with RAID5 or two drives with
> RAID6. Normally with RAID5/6 it is also imperative that you have a hot
> spare per array, which means you waste an additional drive per array.
> To make RAID5/6 work with no single point of failure you have to do
> something like RAID50/60 across two controllers, which gets expensive and
> much more difficult to manage and to grow. Implementing GlusterFS on
> more modest hardware makes all those "issues" go away. Just use
> GlusterFS to provide the RAID-like capabilities (via AFR and DHT).
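>
> To make that concrete, here is a minimal sketch of a GlusterFS 3.x
> client volfile for that kind of layout. The host and export names
> (server1, server2, disk1, mirror1, ...) are placeholders, not from this
> thread; assume each server exports one brick per physical drive:
>
>   # One client volume per exported drive
>   volume server1-disk1
>     type protocol/client
>     option transport-type tcp
>     option remote-host server1
>     option remote-subvolume disk1
>   end-volume
>
>   volume server2-disk1
>     type protocol/client
>     option transport-type tcp
>     option remote-host server2
>     option remote-subvolume disk1
>   end-volume
>
>   # AFR: mirror each drive between two different servers
>   volume mirror1
>     type cluster/replicate
>     subvolumes server1-disk1 server2-disk1
>   end-volume
>
>   # ... repeat for mirror2, mirror3, ... one per drive pair ...
>
>   # DHT: unify all the mirrored pairs into one namespace
>   volume unify
>     type cluster/distribute
>     subvolumes mirror1 mirror2 mirror3 mirror4
>   end-volume
>
> With this arrangement a dead drive only involves its mirror partner in
> the rebuild. In the 3.0-era releases self-heal is triggered on access,
> so you can force a full heal of the pair just by walking the mount
> (e.g. "ls -lR /mnt/glusterfs").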
>
> Personally I doubt that I would set up my storage the way you describe.
> I probably would (and have) set it up with more, smaller servers:
> something like three times as many 2U servers with 8x2Tb drives each (or
> even six times as many 1U servers with 4x2Tb drives each), and forget the
> expensive RAID SATA controllers; they aren't necessary and are just a
> single point of failure that you can eliminate. In addition you will
> enjoy significant performance improvements because you have:
>
> 1) Many parallel paths to storage (36x1U or 18x2U vs 6x5U servers).
> Gigabit Ethernet is fast, but will still limit bandwidth to any single
> machine.
> 2) Write performance on RAID5/6 is never going to be as fast as JBOD.
> 3) You should have much more memory available for caching (36x8Gb = 288Gb
> or 18x8Gb = 144Gb vs maybe 6x16Gb = 96Gb).
> 4) Management of the storage is done in one place: GlusterFS. No messy
> RAID controller setups to document/remember.
> 5) You can expand in the future in a much more granular and controlled
> fashion. Add 2 machines (1 for replication) and you get 8Tb (using 2Tb
> drives) of storage. When you want to replace a machine, just set up the
> new one, fail the old one, and let GlusterFS build the new one for you
> (AFR will do the heavy lifting). CPUs will get faster, and hard drives
> will get faster and bigger in the future, so make it easy to upgrade. A
> small number of BIG machines makes it a lot harder to do upgrades as new
> hardware becomes available.
> 6) Machine failures (motherboard, power supply, etc.) will affect much
> less of your storage network. Having a spare 1U machine around as a hot
> spare doesn't cost much (maybe $1200). Having a spare 5U monster around
> does (probably close to $6000).
>
> IMHO 36 x 1U or 18 x 2U servers shouldn't cost any more (and maybe less)
> than the big boxes you are looking to buy. They are commodity items.
> If you go the 1U route you don't need anything but a machine with
> memory and 4 hard drives (all server motherboards come with at least 4
> SATA ports). By using 2Tb drives, I think you would find that the cost
> would actually be less. By NOT using hardware RAID you can also skip
> RAID-class hard drives, which cost about $100 more each than non-RAID
> hard drives. That change alone could save you 6 servers x 24 drives =
> 144 drives, times $100 = $14,400! JBOD just doesn't need the
> sophisticated firmware that RAID-class hard drives provide. You will
> still want quality hard drives, but failures will have such a low impact
> that they are much less of a problem.
>
> By using more smaller machines you also eliminate the need for redundant
> power supplies (which would be a requirement in your large boxes, since a
> power supply would otherwise be a single point of failure for a large
> percentage of your storage system).
>
> Hope the information helps.
>
> Regards,
> Larry Bates
>
>
> ------------------------------
>> Message: 6
>> Date: Thu, 17 Dec 2009 00:18:54 -0600
>> From: phil cryer <phil at cryer.us>
>> Subject: Recommended GlusterFS configuration for 6 node cluster
>> To: "gluster-users at gluster.org" <gluster-users at gluster.org>
>> Message-ID:
>>       <3a3bc55a0912162218i4e3f326cr9956dd37132bfc19 at mail.gmail.com>
>> Content-Type: text/plain; charset=UTF-8
>>
>> We're setting up 6 servers, each with 24 x 1.5TB drives; the systems
>> will run Debian testing and Gluster 3.x. The SATA RAID card offers
>> RAID5 and RAID6, and we're wondering what the optimum setup would be for
>> this configuration. Do we RAID5 the disks and have GlusterFS use
>> them that way, or do we keep them all 'raw' and have GlusterFS handle
>> the replication (though not 2x as we would have with the RAID
>> options)? Obviously there are a lot of ways to do this; just wondering
>> what GlusterFS devs and other experienced users would recommend.
>>
>> Thanks
>>
>> P
>>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>

-- 
http://philcryer.com