Re: Gluster infrastructure question

Replicas are defined in the order bricks are listed in the volume create command. So

    gluster volume create myvol replica 2 server1:/data/brick1 server2:/data/brick1 server3:/data/brick1 server4:/data/brick1

will replicate between server1 and server2, and replicate between server3 and server4.

Bricks added to a replica 2 volume after it's been created will require pairs of bricks.
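For example, expanding the volume above with one more pair might look like this (a sketch; server5/server6 and the brick paths are assumed names):

    # Add one new replica pair to the existing replica 2 volume; the two
    # bricks listed together become replicas of each other.
    gluster volume add-brick myvol server5:/data/brick1 server6:/data/brick1
    # Optionally spread existing data onto the new bricks afterwards.
    gluster volume rebalance myvol start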

The best way to "force" replication to happen on another server is to just define it that way.
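If you want to double-check which bricks ended up paired, the brick order shown by volume info tells you: each consecutive group of <replica count> bricks is one replica set. A quick check might look like:

    # List the bricks in creation order; consecutive groups of
    # <replica count> bricks form the replica sets.
    gluster volume info myvol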

On 12/09/2013 01:58 PM, Dan Mons wrote:
I went with big RAID on each node (16x 3TB SATA disks in RAID6 with a
hot spare per node) rather than brick-per-disk.  The simple reason
being that I wanted to configure distribute+replicate at the GlusterFS
level, and be 100% guaranteed that the replication happened across to
another node, and not to another brick on the same node.  As each node
only has one giant brick, the cluster is forced to replicate to a
separate node each time.

Some careful initial setup could probably have done the same, but I
wanted to avoid the dramas of my employer expanding the cluster one
node at a time later on, causing that design goal to fail as the new
single node with many bricks found replication partners on itself.

On a different topic, I find no real-world difference between RAID10 and
RAID6 with GlusterFS.  Most of the access delay in Gluster has little
to do with the speed of the disk.  The only downside to RAID6 is a
long rebuild time if you're unlucky enough to blow a couple of drives
at once.  RAID50 might be a better choice if you're up at 20 drives
per node.

We invested in SSD caching on our nodes, and to be honest it was
rather pointless.  Certainly not bad, but the real-world speed boost
is not noticed by end users.

-Dan

----------------
Dan Mons
R&D SysAdmin
Unbreaker of broken things
Cutting Edge
http://cuttingedge.com.au


On 10 December 2013 05:31, Ben Turner <bturner@xxxxxxxxxx> wrote:
----- Original Message -----
From: "Ben Turner" <bturner@xxxxxxxxxx>
To: "Heiko Krämer" <hkraemer@xxxxxxxxxxx>
Cc: "gluster-users@xxxxxxxxxxx List" <gluster-users@xxxxxxxxxxx>
Sent: Monday, December 9, 2013 2:26:45 PM
Subject: Re:  Gluster infrastructure question

----- Original Message -----
From: "Heiko Krämer" <hkraemer@xxxxxxxxxxx>
To: "gluster-users@xxxxxxxxxxx List" <gluster-users@xxxxxxxxxxx>
Sent: Monday, December 9, 2013 8:18:28 AM
Subject:  Gluster infrastructure question


Heyho guys,

I've been running GlusterFS for years in a small environment without big
problems.

Now I'm going to use GlusterFS for a bigger cluster, but I have some
questions :)

Environment:
* 4 servers
* 20 x 2 TB HDDs each
* RAID controller
* RAID 10
* 4 bricks => replicated, distributed volume
* Gluster 3.4

1)
I'm wondering if I can drop the RAID 10 on each server and create a
separate brick for each HDD. In that case the volume would have 80
bricks (4 servers x 20 HDDs). Is there any experience with write
throughput in a production system with that many bricks? In addition,
I'd get double the HDD capacity.
Have a look at:

http://rhsummit.files.wordpress.com/2012/03/england-rhs-performance.pdf
That one was from 2012, here is the latest:

http://rhsummit.files.wordpress.com/2013/07/england_th_0450_rhs_perf_practices-4_neependra.pdf

-b

Specifically:

● RAID arrays
● More RAID LUNs for better concurrency
● For RAID6, 256-KB stripe size

I use a single RAID 6 that is divided into several LUNs for my bricks.  For
example, on my Dell servers (with PERC6 RAID controllers) each server has 12
disks that I put into RAID 6.  Then I break the RAID 6 into 6 LUNs and
create a new PV/VG/LV for each brick.  From there I follow the
recommendations listed in the presentation.
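As a rough sketch of what that per-LUN brick setup can look like (the device name, VG/LV names, mount point, and RAID geometry below are assumptions for illustration, not Ben's exact layout):

    # One PV/VG/LV per RAID 6 LUN; /dev/sdb is an assumed device name.
    pvcreate /dev/sdb
    vgcreate vg_brick1 /dev/sdb
    lvcreate -n lv_brick1 -l 100%FREE vg_brick1
    # XFS with 512-byte inodes is the commonly recommended brick filesystem;
    # align it to the RAID geometry (256 KB stripe unit, 10 data disks in a
    # 12-disk RAID 6 -- both values are assumptions here).
    mkfs.xfs -i size=512 -d su=256k,sw=10 /dev/vg_brick1/lv_brick1
    mkdir -p /bricks/brick1
    mount /dev/vg_brick1/lv_brick1 /bricks/brick1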

HTH!

-b

2)
I heard a talk about GlusterFS and scaling out. The main point was that
if more bricks are in use, the scale-out (rebalance) process will take a
long time; the problem was/is the hash algorithm. So I'm wondering: with
one very big brick per server (RAID 10, 20 TB) versus many more bricks,
which is faster, and are there any issues?
Is there any experience with this?

3)
Failover of an HDD is not a big deal for a RAID controller with a
hot-spare HDD. GlusterFS will rebuild automatically if a brick fails and
comes back with no data present; this will generate a lot of network
traffic between the mirror bricks, but it will handle it just like the
RAID controller, right?



Thanks and cheers
Heiko



--
Anynines.com

Avarteq GmbH
B.Sc. Computer Science
Heiko Krämer
CIO
Twitter: @anynines

----
Managing Directors: Alexander Faißt, Dipl.-Inf. (FH) Julian Fischer
Commercial register: AG Saarbrücken HRB 17413, VAT ID: DE262633168
Registered office: Saarbrücken

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users




