> Message: 5
> Date: Mon, 30 Apr 2012 08:39:57 +0100
> From: Brian Candler <B.Candler at pobox.com>
> Subject: Re: Bricks suggestions
> To: Gandalf Corvotempesta <gandalf.corvotempesta at gmail.com>
> Cc: Gluster-users at gluster.org
> Message-ID: <20120430073957.GA16804 at nsrc.org>
> Content-Type: text/plain; charset=us-ascii
>
> On Sun, Apr 29, 2012 at 11:22:20PM +0200, Gandalf Corvotempesta wrote:
>> So, what will you do? RAID1? No raid?
>
> RAID10 for write-active filesystems, and RAID6 for archive filesystems.
>
>> How does gluster detect a failed disk with no raid? What I don't
>> understand is how gluster will detect a failure on a disk and then
>> reply with data from the other server.
>
> I'm not sure - that's what the risk is. One would hope that gluster would
> detect the failed disk and take it out of service, but I see a lot of
> posts on this list from people who have problems in various failure
> scenarios (failures to heal and the like). I'm not sure that glusterfs
> has really got these situations nailed.
>
> Indeed, in my experience the gluster client won't even reconnect to a
> glusterfsd (brick) if the brick has gone away and come back up. You have
> to manually unmount and remount. That's about the simplest failure
> scenario you can imagine.
>
>> With a raid controller, if the controller detects a failure, it will
>> reply with KO to the operating system
>
> KO or OK? With a RAID controller (or software RAID), the RAID subsystem
> should quietly mark the failed drive as unusable and redirect all
> operations to the working drive. And you will have a way to detect this
> situation, e.g. /proc/mdstat for Linux software RAID.
>
>> Is it safer to use a 24-disk server with no raid and with 24 replicated
>> and distributed bricks (24 on one server and 24 on the other server)?
>
> In theory they should be the same, and with replicated/distributed you
> also get the benefit that if an entire server dies, the data remains
> available. In practice I am not convinced that glusterfs will work well
> this way.
>
> ------------------------------
>
> Message: 6
> Date: Mon, 30 Apr 2012 10:53:42 +0200
> From: Gandalf Corvotempesta <gandalf.corvotempesta at gmail.com>
> Subject: Re: Bricks suggestions
> To: Brian Candler <B.Candler at pobox.com>
> Cc: Gluster-users at gluster.org
> Message-ID: <CAJH6TXh-jb=-Gus2yadhSbF94-dNdQ-iP04jH6H4wGSgNB8LHQ at mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> 2012/4/30 Brian Candler <B.Candler at pobox.com>
>
>> KO or OK? With a RAID controller (or software RAID), the RAID subsystem
>> should quietly mark the failed drive as unusable and redirect all
>> operations to the working drive. And you will have a way to detect this
>> situation, e.g. /proc/mdstat for Linux software RAID.
>
> KO.
> As you wrote, in a RAID environment the controller will detect a failed
> disk and redirect I/O to the working drive.
>
> With no RAID, is gluster smart enough to detect a disk failure and
> redirect all I/O to the other server?
>
> A disk can have a damaged cluster, so only a portion of it becomes
> unusable. A RAID controller is able to detect this; will gluster do the
> same, or will it still try to reply with broken data?
>
> So, do you suggest using RAID10 on each server?
> - disk1+disk2 raid1
> - disk3+disk4 raid1
>
> raid0 over these raid1 arrays, and then replicate with gluster?
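As an aside on the /proc/mdstat point above: a minimal monitoring sketch
in Python, assuming the standard mdstat layout, where an underscore in the
member-status field (e.g. [U_] instead of [UU]) marks a failed or missing
mirror. The path and output format are just illustrative.

#!/usr/bin/env python
# Scan /proc/mdstat and report any degraded md arrays.
import re
import sys

def degraded_arrays(path="/proc/mdstat"):
    failed = []
    current = None
    with open(path) as f:
        for line in f:
            m = re.match(r"^(md\d+)\s*:", line)
            if m:
                # Start of an array block, e.g.
                # "md0 : active raid1 sdb1[1] sda1[0]"
                current = m.group(1)
            # Status line, e.g. "976630336 blocks super 1.2 [2/2] [UU]"
            s = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
            if s and current:
                total, active = int(s.group(1)), int(s.group(2))
                if active < total or "_" in s.group(3):
                    failed.append((current, s.group(3)))
    return failed

if __name__ == "__main__":
    bad = degraded_arrays()
    for name, status in bad:
        print("DEGRADED: %s %s" % (name, status))
    sys.exit(1 if bad else 0)

Something like this can run from cron and be wired to mail, which is all
most software-RAID alerting really amounts to.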
> ------------------------------

I have been running the following configuration for over 16 months with
no issues: Gluster v3.0.0 on two SuperMicro servers, each with 8x2TB hard
drives configured as JBOD. I use Gluster to replicate each drive between
servers and then distribute across the drives, giving me approx 16TB as a
single volume. I can pull a single drive, replace it, and then use self
heal to rebuild. I can shut down or reboot a server and traffic continues
to the other server (good for kernel updates). I use logdog to alert me
via email/text if a drive fails.

I chose this config because it 1) was simplest, 2) maximized my disk
storage, 3) effectively resulted in a shared-nothing RAID10 SAN-like
storage system, 4) minimized the amount of data movement during a rebuild,
and 5) didn't require any hardware RAID controllers, which would have
increased my cost. This config has worked for me exactly as planned.

I'm currently building a new server with 8x4TB drives and will be
replacing one of the existing servers in a couple of weeks. I will force a
self heal to populate it with files from the primary server. When done,
I'll repeat the process for the other server.

Larry Bates
vitalEsafe, Inc.
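For anyone sketching out the same layout, below is a hypothetical
illustration in Python of how the replicate-each-drive-then-distribute
arrangement maps onto a distributed-replicated volume. The server names
and brick paths are invented, and the gluster CLI shown in the output only
appeared in 3.1 (3.0.0 was configured through volume files), so treat this
as a sketch of the layout rather than the exact commands used here.

#!/usr/bin/env python
# Build the brick list for a 2-server, 8-drive-per-server
# distributed-replicated volume and print the volume-create command.
servers = ("serverA", "serverB")          # invented hostnames

bricks = []
for d in range(1, 9):                     # 8 drives per server
    for host in servers:                  # list each replica pair adjacently
        bricks.append("%s:/bricks/disk%d" % (host, d))

# With "replica 2", gluster pairs bricks in the order given, so disk N on
# serverA mirrors disk N on serverB, and files are distributed across the
# 8 mirrored pairs (~16TB usable from 8x2TB drives per server).
print("gluster volume create vol0 replica 2 " + " ".join(bricks))

The appeal of this shape is the one listed above: a shared-nothing,
RAID10-like result with no RAID controller, where a rebuild after a drive
swap only has to re-copy that one drive's worth of data.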