Re: Remove an artificial limitation of disperse volume

Hi Olivier,

Sorry, I didn't see this email earlier.

We've already discussed this in private, but I'll answer here to make things clearer for everyone.

On 07/02/17 14:16, Olivier Lambert wrote:
Hi everyone!

I'm currently working on implementing Gluster on XenServer/Xen Orchestra.

I want to expose some Gluster features (in the easiest possible way for
the user).

Therefore, I want to expose only the "distributed/replicated" and
"disperse" modes. From what I understand, they work differently.
Let's take a simple example.

Setup: 6x nodes with 1x 200GB disk each.

* Disperse with redundancy 2 (4+2): I can lose **any 2 of all my
disks**. Total usable space is 800GB. It's a kind of RAID6 (or RAIDZ2)
* Distributed/replicated with replica 2: I can lose 2 disks **BUT**
not on the same "mirror". Total usable space is 600GB. It's a kind of
RAID10

So far, is it correct?

Yes, but sometimes you can gain some performance by splitting each disk into two bricks if the disks are not the bottleneck.
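
The capacity figures in the 6-node example can be checked with a few lines of Python. This is only an arithmetic sketch: the function names are made up for illustration and are not part of Gluster, and it assumes one equally sized brick per disk.

```python
def disperse_usable_gb(bricks, redundancy, brick_gb):
    """Usable space of a disperse volume: only the data bricks count."""
    return (bricks - redundancy) * brick_gb


def replicated_usable_gb(bricks, replica, brick_gb):
    """Usable space of a distributed/replicated volume: one brick per replica set."""
    return (bricks // replica) * brick_gb


# 6 nodes, one 200GB disk each (the example from this thread)
disperse_6 = disperse_usable_gb(bricks=6, redundancy=2, brick_gb=200)
replicated_6 = replicated_usable_gb(bricks=6, replica=2, brick_gb=200)

print("disperse 4+2:", disperse_6, "GB")
print("dist/rep replica 2:", replicated_6, "GB")
```

With these inputs the two functions give 800GB and 600GB, matching the numbers quoted above.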


My main point is that the behavior is very different (pairing disks in
distributed/replicated and "shared" parity in disperse).

Now, let's imagine something else. 4x nodes with 1x 200GB disk each.

Why not have disperse with redundancy 2? The usable storage space would
be the same as with distributed/replicated, **BUT** in disperse I can
lose any 2 disks. In dist/rep, I can lose 2 disks only if they are not
on the same "mirror".
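
The difference in the 4-disk case can be made concrete by enumerating every possible 2-disk failure. The sketch below is a model of the argument, not of Gluster itself: the brick numbering and the pairing of bricks into mirrors (0 with 1, 2 with 3) are assumptions chosen for illustration.

```python
from itertools import combinations

BRICKS = [0, 1, 2, 3]
MIRRORS = [(0, 1), (2, 3)]  # assumed replica-2 pairing, for illustration only


def disperse_survives(failed, redundancy):
    """A disperse volume tolerates any set of failures up to its redundancy."""
    return len(failed) <= redundancy


def replicated_survives(failed, mirrors):
    """A replicated volume loses data once every brick of one mirror has failed."""
    failed = set(failed)
    return all(not set(mirror) <= failed for mirror in mirrors)


two_disk_failures = list(combinations(BRICKS, 2))
disperse_ok = [f for f in two_disk_failures if disperse_survives(f, redundancy=2)]
replicated_ok = [f for f in two_disk_failures if replicated_survives(f, MIRRORS)]

print("2-disk failure cases:", len(two_disk_failures))
print("survived by a hypothetical disperse 2+2:", len(disperse_ok))
print("survived by dist/rep replica 2:", len(replicated_ok))
```

Of the six possible 2-disk failures, the hypothetical 2+2 disperse layout survives all of them, while the replicated layout survives only the four that do not take out a whole mirror.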

So far, I can't create a disperse volume if the redundancy level is
50% or more of the number of bricks. I know that performance would be
better in dist/rep, but what if I prefer to have disperse anyway?
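
The limit described here can be written as a one-line check. This is a sketch of the rule as stated in this thread (redundancy must stay below half the number of bricks); the function name is hypothetical and this is not Gluster's actual validation code.

```python
def disperse_config_allowed(bricks, redundancy):
    """Return True if redundancy is positive and below 50% of the bricks."""
    return redundancy > 0 and bricks > 2 * redundancy


# The two configurations discussed in this thread
six_bricks_ok = disperse_config_allowed(bricks=6, redundancy=2)   # 4+2
four_bricks_ok = disperse_config_allowed(bricks=4, redundancy=2)  # 2+2

print("6 bricks, redundancy 2 allowed:", six_bricks_ok)
print("4 bricks, redundancy 2 allowed:", four_bricks_ok)
```

Under this rule the 4+2 layout passes, while 2+2 is rejected because the redundancy is exactly half the bricks.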

Conclusion: would it be possible to have a "force" flag during
disperse volume creation even if redundancy is 50% or higher?

That's a design decision. It was made to avoid most split-brain scenarios, and on the assumption that 50% redundancy is already covered by replicate (even if the conditions are not exactly the same).

The Reed-Solomon algorithm can create as many redundancy fragments as there are data bricks, or even more (the only real limitation is the size of the Galois field used). However, allowing this in disperse would introduce many complex scenarios that are difficult to solve and prone to failures or data corruption, so it was decided not to support those configurations.
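
To illustrate the first point, here is a toy Reed-Solomon-style erasure code with 2 data fragments and 3 redundancy fragments. It is a sketch under loud assumptions: it works over the small prime field GF(257) for readability, which is not claimed to be the field Gluster uses, and the function names are made up. It needs Python 3.8+ for the modular inverse via pow().

```python
from itertools import combinations

P = 257  # prime field size (an assumption for readability); limits total fragments


def _interpolate(points, x):
    """Evaluate at x the unique polynomial through the (xi, yi) points, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, total_fragments):
    """Fragments 0..k-1 are the data itself; the rest are redundancy."""
    assert total_fragments <= P, "the field size bounds the number of fragments"
    points = list(enumerate(data))
    return [_interpolate(points, x) for x in range(total_fragments)]


def decode(surviving, k):
    """Rebuild the k data symbols from any k surviving {index: value} fragments."""
    points = list(surviving.items())[:k]
    return [_interpolate(points, x) for x in range(k)]


data = [72, 105]            # 2 data symbols
fragments = encode(data, 5)  # 2 data + 3 redundancy fragments

# Any 2 of the 5 fragments are enough to rebuild the data.
recovered_all = all(
    decode({i: fragments[i] for i in pair}, k=2) == data
    for pair in combinations(range(5), 2)
)
print("recovered from every pair of fragments:", recovered_all)
```

The math has no trouble with redundancy above 50%. The sketch says nothing about the hard part mentioned above, which is keeping the fragments consistent when bricks fail or partition during writes.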

Xavi




Thanks!



Olivier.
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users

