Re: Need help to design a data storage


 




Yes Gandalf, I think you are missing a point: the way we configure EC.
To explain it, I will use a smaller number of disks. Let's say you have 6 disks of 1TB each, on 6 different nodes.

1 - Replica 2 using gluster
There will be 3 replica sub volumes (afr-1, afr-2, afr-3), each made of a pair of disks.
A file named file.txt will be saved on both disks of one sub volume. That means you are cutting the storage space in half: 3TB.
Also, at any point in time you can afford to lose only 1 brick in any given sub volume.

2 - Replica 3 using gluster
There will be 2 replica sub volumes (afr-1, afr-2), each with 3 disks.
A file named file.txt will be saved on all 3 disks of one sub volume, let's say afr-1. That means you are cutting the storage space to 1/3rd: 2TB.
Also, at any point in time you can afford to lose at most 2 bricks of afr-1.

3 - EC with redundancy 2, that is 4+2
The overall storage space you get is 4TB, and any 2 bricks can be down at any point in time. So it is as good as replica 3 while providing more space.
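The arithmetic behind the three 6-disk cases above can be sketched in a few lines of Python. This is only an illustration: the `Layout` class and the `summarize` function are hypothetical names, not part of gluster, and the sketch assumes equally sized 1TB bricks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    name: str
    subvol_bricks: int  # bricks that make up one sub volume
    data_bricks: int    # bricks' worth of usable space per sub volume

    @property
    def redundancy(self) -> int:
        # Bricks that may fail inside one sub volume without data loss.
        return self.subvol_bricks - self.data_bricks


def summarize(layout: Layout, total_bricks: int, brick_tb: float = 1.0):
    """Return (sub volumes, usable TB, tolerated failures per sub volume)."""
    if total_bricks % layout.subvol_bricks != 0:
        raise ValueError("total_bricks must be a multiple of the sub volume size")
    subvols = total_bricks // layout.subvol_bricks
    usable_tb = subvols * layout.data_bricks * brick_tb
    return subvols, usable_tb, layout.redundancy


REPLICA_2 = Layout("replica 2", subvol_bricks=2, data_bricks=1)
REPLICA_3 = Layout("replica 3", subvol_bricks=3, data_bricks=1)
EC_4_2 = Layout("EC 4+2", subvol_bricks=6, data_bricks=4)

for layout in (REPLICA_2, REPLICA_3, EC_4_2):
    subvols, usable_tb, tolerated = summarize(layout, total_bricks=6)
    print(f"{layout.name}: {subvols} sub volume(s), {usable_tb:g}TB usable, "
          f"{tolerated} failure(s) tolerated per sub volume")
```

Treating a replica set as "1 data brick plus copies" lets the same formula cover both replication and EC.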

Now, about your example of 108 disks: you should not use a 106+2 EC configuration as you were describing. That really is a very poor setup, and you are right about its redundancy.
But when you say that you will create replica 3, that means you will have 108/3 = 36 sub volumes, each with only 1TB of storage space. So you will get 36TB of storage if each brick is 1TB.

So I would say that you should create EC with a 4+2 configuration. There will be 108/6 = 18 sub volumes, each with 4TB of capacity. The total space you get is 18 x 4 = 72TB.
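For the 108-brick example, the same numbers can be checked with a small sketch. The `plan` function and its dictionary keys are made up for illustration, and 1TB bricks are assumed as in the text.

```python
def plan(total_bricks, subvol_bricks, data_bricks, brick_tb=1):
    """Split total_bricks into equal sub volumes and report usable capacity."""
    subvols, leftover = divmod(total_bricks, subvol_bricks)
    if leftover:
        raise ValueError("total_bricks must be a multiple of the sub volume size")
    return {
        "sub_volumes": subvols,
        "usable_tb": subvols * data_bricks * brick_tb,
        # How many bricks share one redundancy budget.
        "bricks_per_sub_volume": subvol_bricks,
    }


layouts = {
    "replica 3": plan(108, subvol_bricks=3, data_bricks=1),
    "EC 4+2": plan(108, subvol_bricks=6, data_bricks=4),
    "EC 106+2": plan(108, subvol_bricks=108, data_bricks=106),
}

for name, result in layouts.items():
    print(name, result)
```

The 106+2 layout gives the most space, but all 108 bricks share a single redundancy of 2, which is why it is a poor setup.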

In both of the above cases, even if you kill 2 bricks from the same sub volume, data will still be served. If you kill a 3rd brick from the same sub volume, you will lose data in a replica sub volume as well as in an EC sub volume.
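To put a rough number on the "3rd brick from the same sub volume" case, here is a back-of-the-envelope sketch. It assumes exactly 3 bricks fail at once, chosen uniformly at random out of 108, with no healing in between. That is a simplification, not a model of real disk failures, and `loss_probability` is a hypothetical helper.

```python
from fractions import Fraction
from math import comb


def loss_probability(total_bricks: int, subvol_bricks: int, redundancy: int) -> Fraction:
    """Chance that redundancy + 1 random simultaneous failures hit one sub volume.

    With exactly redundancy + 1 failed bricks, data is lost only if all of
    them fall inside the same sub volume.
    """
    failed = redundancy + 1
    subvols = total_bricks // subvol_bricks
    losing_combinations = subvols * comb(subvol_bricks, failed)
    return Fraction(losing_combinations, comb(total_bricks, failed))


replica_3 = loss_probability(108, subvol_bricks=3, redundancy=2)
ec_4_2 = loss_probability(108, subvol_bricks=6, redundancy=2)
ec_106_2 = loss_probability(108, subvol_bricks=108, redundancy=2)

print(f"replica 3: {float(replica_3):.6f}")
print(f"EC 4+2:    {float(ec_4_2):.6f}")
print(f"EC 106+2:  {float(ec_106_2):.6f}")
```

Under these assumptions a 4+2 sub volume is more exposed than a replica 3 set, because 6 bricks share the same redundancy instead of 3, but both are far from the 106+2 case, where any 3 failures lose data.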

----
Ashish






From: "Gandalf Corvotempesta" <gandalf.corvotempesta@xxxxxxxxx>
To: "Ashish Pandey" <aspandey@xxxxxxxxxx>
Cc: gluster-users@xxxxxxxxxxx
Sent: Tuesday, August 9, 2016 8:33:31 PM
Subject: Re: Need help to design a data storage

On 09 Aug 2016 10:06 AM, "Ashish Pandey" <aspandey@xxxxxxxxxx> wrote:
> If your main concern is data redundancy, I would suggest you go for the erasure coded volume provided by gluster.

Anyway, EC volumes have a lower redundancy level than standard replicated volumes.

Let's assume a 9-node cluster with 12 disks on each node and redundancy set to 2.

You have 9*12 = 108 disks/bricks.
With redundancy 2 you can lose up to 2 bricks/disks at the same time before losing data. Using cheap SATA disks (gluster is made to run on commodity hardware), losing 3 disks out of 108 in a very short time could happen frequently, and this frequency grows as the cluster grows.

With a standard replicated volume with replica 3, you can lose up to 3 servers (not bricks), because each brick in a replica set must be on a different server.

I think EC is something like RAID 6 (with more "parity") and standard replication is like RAID 10, but with 3 disks for each mirror.

RAID 10 is safer, as you can lose as many disks as you want if they are in different replica sets, while RAID 6 can lose up to 2 disks in the whole cluster.
The higher the number of disks, the higher the probability of data loss with RAID 6/EC.

Am I missing something?


_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
