GlusterFS on a two-node setup

jjolet at drillinginfo.com (John Jolet) · Mon, 21 May 2012 00:46:33 +0000

On May 20, 2012, at 4:55 PM, Ramon Diaz-Uriarte wrote:

> 
> 
> 
> On Sun, 20 May 2012 20:38:02 +0100, Brian Candler <B.Candler at pobox.com> wrote:
>> On Sun, May 20, 2012 at 01:26:51AM +0200, Ramon Diaz-Uriarte wrote:
>>> Questions:
>>> ==========
>>> 
>>> 1. Is using GlusterFS overkill? (I guess the alternative would be to use
>>>  NFS from one of the nodes to the other)
> 
>> In my opinion, the other main option you should be looking at is DRBD
>> (www.drbd.org).  This works at the block level, unlike glusterfs which works
>> at the file level.  Using this you can mirror your disk remotely.
> 
> 
> Brian, thanks for your reply. 
> 
> 
> I might have to look at DRBD more carefully, but I do not think it fits my
> needs: I need both nodes to be working (and thus doing I/O) at the same
> time. These are basically number crunching nodes and data needs to be
> accessible from both nodes (e.g., some jobs will use MPI over the
> CPUs/cores of both nodes ---assuming both nodes are up, of course ;-).
> 
> 
> 
> 
>> If you are doing virtualisation then look at Ganeti: this is an environment
>> which combines LVM plus DRBD and allows you to run VMs on either node and
>> live-migrate them from one to the other.
>> http://docs.ganeti.org/ganeti/current/html/
> 
> I am not doing virtualisation. I should have said that explicitly. 
> 
> 
>> If a node fails, you just restart the VMs on the other node and away you go.
> 
>>> 2. I plan on using a dedicated partition from each node as a brick. Should
>>>  I use replicated or distributed volumes?
> 
>> A distributed volume will only increase the size of storage available (e.g. 
>> combining two 600GB drives into one 1.2TB volume - as long as no single file
>> is too large).  If this is all you need, you'd probably be better off buying
>> bigger disks in the first place.
> 
>> A replicated volume allows you to have a copy of every file on both nodes
>> simultaneously, kept in sync in real time, and gives you resilience against
>> one of the nodes failing.
> 
> 
> But from the docs and the mailing list I get the impression that
> replication has severe performance penalties when writing and some
> penalties when reading. And with a two-node setup, it is unclear to me
> that, even with replication, if one node fails, gluster will continue to
> work (i.e., the other node will continue to work). I've not been able to
> find what is the recommended procedure to continue working, with
> replicated volumes, when one of the two nodes fails. So that is why I am
> wondering what would replication really give me in this case.
> 
> 
Replicated volumes have a performance penalty on the client.  For instance, I have a replicated volume with one replica on each of two nodes.  I'm front-ending this with an Ubuntu box running Samba for CIFS sharing.  If my Windows client sends 100MB to the CIFS server, the CIFS server will send 100MB to each node in the replica set.

As for what you have to do to continue working if a node goes down, I have tested this.  Not on purpose, but one of my nodes was accidentally downed.  My client saw no difference.  However, running 3.2.x, I had to remount the share on the CIFS server to get the client to use the downed node again after it was brought back up.  This is supposedly fixed in 3.3.
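
To put numbers on the points above, here is a rough Python sketch of the arithmetic only.  The function names are made up for illustration (nothing here is a gluster API), and the capacity function assumes all bricks form either one plain distributed volume or one single replica set:

```python
def client_write_traffic_mb(payload_mb, replica_count):
    """MB the gluster client sends for one write, since it sends a
    full copy of the payload to every node in the replica set."""
    if replica_count < 1:
        raise ValueError("replica_count must be at least 1")
    return payload_mb * replica_count


def usable_capacity_gb(brick_sizes_gb, replicated):
    """Usable space, assuming a single replica set (replicated=True)
    or a plain distributed volume (replicated=False)."""
    if not brick_sizes_gb:
        raise ValueError("need at least one brick")
    if replicated:
        # Every brick holds a full copy, so the smallest brick is the limit.
        return min(brick_sizes_gb)
    # Distributed: files are spread across bricks, so the sizes add up.
    return sum(brick_sizes_gb)


print(client_write_traffic_mb(100, 2))          # 200
print(usable_capacity_gb([600, 600], False))    # 1200
print(usable_capacity_gb([600, 600], True))     # 600
```

So the 100MB the Windows client sends becomes 200MB leaving the CIFS server, and two 600GB bricks give 1.2TB distributed or 600GB replicated.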

It's important to note that self-healing will create files that were created while the node was offline, but it does not DELETE files that were deleted while the node was offline.  I'm not sure what the official line is there, but my use is archival, so it doesn't matter enough to me to run it down (if they deleted files, I wouldn't need gluster...).
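
A toy Python model of the behaviour I saw, in case it helps.  This only mirrors what I described above (creates propagate, deletes do not); it is not how gluster implements self-heal, and the function and file names are made up:

```python
def self_heal_as_observed(returned_node_files, surviving_node_files):
    """Files on the returned node after healing, as I observed it:
    anything missing gets created, nothing gets deleted."""
    return set(returned_node_files) | set(surviving_node_files)


# State of both replicas before one node went down.
before_outage = {"a.dat", "b.dat"}

# While the node was down: a.dat was deleted and c.dat was created.
surviving = (before_outage - {"a.dat"}) | {"c.dat"}

healed = self_heal_as_observed(before_outage, surviving)
print(sorted(healed))  # ['a.dat', 'b.dat', 'c.dat']
```

Note that a.dat is still on the returned node even though it was deleted during the outage.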

> Best,
> 
> R.
> 
> 
> 
> 
>> Regards,
> 
>> Brian.
> -- 
> Ramon Diaz-Uriarte
> Department of Biochemistry, Lab B-25
> Facultad de Medicina 
> Universidad Autónoma de Madrid 
> Arzobispo Morcillo, 4
> 28029 Madrid
> Spain
> 
> Phone: +34-91-497-2412
> 
> Email: rdiaz02 at gmail.com
>       ramon.diaz at iib.uam.es
> 
> http://ligarto.org/rdiaz
> 