Hi Liam,

* replies inline *

On Wed, Jan 6, 2010 at 7:07 AM, Liam Slusser <lslusser at gmail.com> wrote:
> Arvids & Larry,
>
> Interesting read, Arvids. And I completely agree. On our large raid6
> array it takes 1 week to rebuild the array from any ONE drive failure.
> It's a scary time when doing a rebuild because of the decreased
> performance from the array and the increased chance of a full raid
> failure if we lost another two drives. Makes for a very long week of
> nail biting.
>
> Larry brought up some great points. I, also, have been burned way too
> many times by raid5 and only use it if I absolutely have to. I
> normally stick to raid1/10 or raid6/60. Even with my huge raid6
> rebuild time of a week, it's still faster to do that than have Gluster
> resync everything. The raid rebuild does affect the performance of
> the box, but so would a Gluster rebuild.
>
> As for Larry's point #4, I duplicate the data across two boxes using
> cluster/replicate on top of raid6. So each box has a large raid6
> set and I dup the data between the two. That way, if for whatever
> reason I did lose a whole raid array, I can still recover with Gluster.
>
> I've also been frowned on for using desktop drives in our servers -
> but on the bright side I've had very few problems with them. Of
> course it did take buying a bunch of different raid cards and drives
> before finding a combination that played well together. We currently
> have 240 Seagate 1.5tb desktop drives in our two gluster clusters and
> have only had to replace three in the last year - two that just died
> and one that started to get SMART errors, so it was replaced. I
> haven't had a problem getting Seagate to replace the drives - as they
> fail I ship them off to Seagate and they send me a new one. I did
> figure we would have to do support in-house, so we bought lots of
> spare parts when we ordered everything. It was still way cheaper to
> buy desktop drives and Supermicro servers with lots of spare parts
> than shopping at Dell, HP or Sun - by more than half.
>
> Honestly my biggest peeve with Gluster is the rebuild process. Take
> the OneFS filesystem in Isilon clusters - they are able to rebuild at
> a block level, replicating only the information which has changed. So
> even with one node being offline all day, a rebuild/resync operation
> is very quick. And having 30 billion files or 10 huge ones makes no
> difference to resync speed. With Gluster, by contrast, a huge
> directory tree/number of files can take days if not weeks to finish.
> Of course, since Gluster runs on top of a normal filesystem such as
> xfs/ext3/zfs, having access to block-level replication may be tricky.
> I honestly would not be against the Gluster team modifying the
> xfs/ext3/whatever filesystem so they could tailor it more to their
> own needs - which of course would make it far less portable and much
> more difficult to install and configure...

GlusterFS has done checksum-based self-heal since the 3.0 release; I
believe your experiences are from 2.0, which had the problem of doing a
full-file self-heal, which takes a lot of time. I would suggest
upgrading your cluster to the 3.0.1 release, which is due the first
week of February. With the new self-heal in the 3.x releases you should
see much shorter rebuild times. If it's possible to compare the 3.0.1
rebuild times against OneFS from Isilon, that would help us improve it
too.
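For reference, the two-box mirror you describe is wired up on the
client side with a cluster/replicate volfile along these lines - a
minimal sketch only, where the host names (server1/server2) and the
exported brick name are placeholders, not your actual config:

    volume box1
      type protocol/client
      option transport-type tcp
      option remote-host server1      # first raid6 box (placeholder)
      option remote-subvolume brick   # exported brick volume (placeholder)
    end-volume

    volume box2
      type protocol/client
      option transport-type tcp
      option remote-host server2      # second raid6 box (placeholder)
      option remote-subvolume brick
    end-volume

    volume mirror
      type cluster/replicate          # AFR: writes go to both boxes;
      subvolumes box1 box2            # self-heal repairs a box that was down
    end-volume

Self-heal runs inside that replicate translator, which is why the 3.0
checksum-based algorithm matters here: it should only have to ship the
parts of a file that differ between box1 and box2 instead of rewriting
whole files.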
Thanks

> Whatever the solution is, I can tell you that the rebuild issues will
> only get worse as drives continue to get larger and the number of
> files/directories continues to grow. Sun's ZFS filesystem goes a long
> way toward fixing some of these problems; I just wish they would port
> it over to Linux.

I would suggest waiting for "btrfs".

> liam
>
> On Tue, Jan 5, 2010 at 2:17 PM, Arvids Godjuks <arvids.godjuks at gmail.com> wrote:
> > Consider this - a rebuild of a 1.5-2 TB HDD in a raid5/6 array can
> > easily take up to a few days to complete. During that time the
> > storage on that node will not perform well. A week ago I read a very
> > good article with research on this area; the only catch is that it's
> > in Russian, but it mentions a few English sources too. Maybe Google
> > Translate will help.
> > Here's the original link: http://habrahabr.ru/blogs/hardware/78311/
> > Here's the Google Translate version:
> > http://translate.google.com/translate?js=y&prev=_t&hl=en&ie=UTF-8&layout=1&eotf=1&u=http%3A%2F%2Fhabrahabr.ru%2Fblogs%2Fhardware%2F78311%2F&sl=ru&tl=en
> > (looks quite neat by the way)
> >
> > 2010/1/5 Liam Slusser <lslusser at gmail.com>:
> >> Larry & All,
> >>
> >> I would much rather rebuild a bad drive with a raid controller than
> >> have to wait for Gluster to do it. With a large number of files,
> >> doing a ls -aglR can take weeks. Also you don't NEED enterprise
> >> drives with a raid controller; I use desktop 1.5tb Seagate drives
> >> which are happy as a clam on a 3ware SAS card under a SAS expander.
> >>
> >> liam
> >>
> >
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
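Re: the "ls -aglR" trick Liam mentions above - with replicate,
self-heal is triggered when a file is accessed, so after a node outage
the usual way to force a full resync is to walk the whole mount. A
minimal sketch, where the mount point /mnt/glusterfs is a placeholder:

    find /mnt/glusterfs -type f -print0 | xargs -0 head -c1 > /dev/null

Reading one byte of each file is enough to make the replicate
translator compare the copies and repair the stale one; the crawl
itself is what takes days on a large file count, which is exactly the
pain point in this thread.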