Best practices?

Greg_Swift at aotx.uscourts.gov (Greg_Swift at aotx.uscourts.gov) · Mon, 23 Jan 2012 15:54:45 -0600

gluster-users-bounces at gluster.org wrote on 01/22/2012 04:17:02 PM:

>
> Suppose I start building nodes with (say) 24 drives each in them.
>
> Would the standard/recommended approach be to make each drive its own
> filesystem, and export 24 separate bricks, server1:/data1 ..
> server1:/data24 ?  Making a distributed replicated volume between this
> and another server would then have to list all 48 drives individually.
>
> At the other extreme, I could put all 24 drives into some flavour of
> stripe or RAID and export a single filesystem out of that.
>
> It seems to me that having separate filesystems per disk would be the
> easiest to understand and to recover data from, and allow volume 'hot
> spots' to be measured and controlled, at the expense of having to add
> each brick separately into a volume.
>
> I was trying to find some current best-practices or system design
> guidelines on the wiki, but unfortunately a lot of what I find is marked
> "out of date", e.g.
> http://gluster.org/community/documentation/index.php/Guide_to_Optimizing_GlusterFS
> http://gluster.org/community/documentation/index.php/Best_Practices_v1.3
> [the latter is not marked out of date, but links to pages which are]
>
> Also the glusterfs3.2 admin guide seems to dodge this issue, assuming you
> already have your bricks prepared before telling you how to add them into
> a volume.
>
> But if you can point me at some recommended reading, I'd be more than
> happy to read it :-)
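
As an aside on the 48-brick listing mentioned in the question: the brick list does not have to be typed by hand. Below is a minimal sketch that generates it, assuming a second server named server2, the /data1 .. /data24 mount points from the question, and a hypothetical volume name myvol. The bricks are interleaved so that each consecutive pair (server1:/dataN, server2:/dataN) ends up as one replica set; check the resulting command against the admin guide for your release before running it.

```python
def brick_list(servers, disks_per_server, path_template="/data{}"):
    """Return bricks ordered disk by disk, alternating between servers.

    With 'replica 2', consecutive bricks form a replica pair, so the
    same disk number on each server is listed back to back.
    """
    bricks = []
    for disk in range(1, disks_per_server + 1):
        for server in servers:
            bricks.append(f"{server}:{path_template.format(disk)}")
    return bricks


# Assumed names: server2 and myvol are illustrative, not from the question.
bricks = brick_list(["server1", "server2"], 24)
print("gluster volume create myvol replica 2 " + " ".join(bricks))
```

The script only prints the command, so it can be reviewed before anything touches the cluster.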

It's been discussed a few times on the list in the abstract, but I can give
you one lesson learned from our environment.

The volume-to-brick ratio is a sliding scale.  You can have more of one,
but then you need to have fewer of the other.

So taking your example above:

2 nodes
24 disks per node

Let's lay that out as possible configurations (each brick in each volume
means one running process and one port on the node):

2 nodes
24 bricks per node per volume
1 volume
---------
= 24 running processes and 24 ports per node

2 nodes
24 bricks per node per volume
100 volumes
---------
= 2400 running processes and 2400 ports per node

2 nodes
1 brick per node per volume
24 volumes
---------
= 24 running processes and 24 ports per node

2 nodes
1 brick per node per volume
2400 volumes
---------
= 2400 running processes and 2400 ports per node

More processes and ports mean more potential for port conflicts,
connectivity issues, hitting open-file limits (ulimits), etc.
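
The arithmetic behind the four configurations above can be sketched in a few lines. This is only an illustration of the counting rule used in this message (bricks per node per volume, times the number of volumes); the function name is made up for the example.

```python
def brick_processes_per_node(bricks_per_node_per_volume: int, volumes: int) -> int:
    """Count brick processes (and ports) on one node.

    Assumes one process and one port per brick per volume, as in the
    configurations listed in this message.
    """
    return bricks_per_node_per_volume * volumes


# (bricks per node per volume, number of volumes) for the four cases above
configs = [(24, 1), (24, 100), (1, 24), (1, 2400)]
for bricks, volumes in configs:
    count = brick_processes_per_node(bricks, volumes)
    print(f"{bricks} bricks/node/volume x {volumes} volumes "
          f"= {count} processes and ports per node")
```

Plugging in your own planned brick and volume counts gives a quick feel for how fast the per-node process count grows.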

That's not the only thing to keep in mind, but it's a poorly documented one
that burned me, so :)