Re: Add single server

On 05/01/2017 01:52 PM, Gandalf Corvotempesta wrote:
> 2017-05-01 19:50 GMT+02:00 Shyam <srangana@xxxxxxxxxx>:
>> Splitting the bricks need not be a post factum decision, we can start with
>> larger brick counts, on a given node/disk count, and hence spread these
>> bricks to newer nodes/bricks as they are added.
>>
>> If I understand the ceph PG count, it works on a similar notion, till the
>> cluster grows beyond the initial PG count (set for the pool) at which point
>> there is a lot more data movement (as the PG count has to be increased, and
>> hence existing PGs need to be further partitioned)

> Exactly.
> Last time I used Ceph, the PGs worked in a similar way.
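
To make that analogy concrete, here is a toy model (plain Python; it has nothing to do with Ceph's CRUSH or our DHT internals, and the numbers are made up) of why a larger, pre-created bucket/brick count keeps data movement bounded when capacity is added, while growing the bucket count after the fact touches a much larger fraction of the objects:

    import hashlib

    def bucket_of(name, nbuckets):
        # stable hash of a file name into one of nbuckets (PGs/bricks)
        return int(hashlib.md5(name.encode()).hexdigest(), 16) % nbuckets

    NB = 64                                  # bucket count fixed up front
    objects = ["file-%d" % i for i in range(100000)]

    # Initial layout: 64 buckets spread over 4 servers.
    owner = {b: b % 4 for b in range(NB)}

    # "+1 server" at a fixed bucket count: hand roughly a fifth of the
    # *buckets* to the new server, one whole bucket at a time; nothing
    # else is touched.
    new_owner = dict(owner)
    for b in range(NB // 5):
        new_owner[b] = 4

    moved = sum(owner[bucket_of(o, NB)] != new_owner[bucket_of(o, NB)]
                for o in objects)
    print("+1 server, fixed bucket count: ~%d%% of data moves"
          % (100 * moved // len(objects)))

    # Outgrowing the bucket count: splitting 64 -> 128 buckets re-homes
    # roughly half of the objects within every bucket, i.e. far more movement.
    moved = sum(bucket_of(o, NB) != bucket_of(o, 2 * NB) for o in objects)
    print("bucket count doubled: ~%d%% of objects change bucket"
          % (100 * moved // len(objects)))

The exact percentages do not matter; the point is that with a pre-split count we move whole buckets/bricks, and the moved fraction stays roughly proportional to the capacity added.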


Expanding on this notion, the brick splitting under consideration needs some further enhancements, so that the replication/availability count is retained when moving existing bricks from one place to another. Thoughts on this are posted here [1].
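
To be concrete about what "retain the count while moving" could look like, here is an illustrative sketch (Python only, not the actual mechanics proposed in [1] and not gluster code; heal() is a stand-in for self-heal, and the brick names are hypothetical): attach the destination brick first and retire the source only after it is healed, so the number of good copies never drops below the configured replica count.

    def move_brick(replica_set, src, dst, replica_count, heal):
        # replica_set: bricks currently holding one copy each of the data
        assert src in replica_set and len(replica_set) == replica_count

        replica_set.append(dst)      # temporarily replica_count + 1 members
        heal(dst)                    # populate dst from the surviving copies
        replica_set.remove(src)      # only now retire the source brick

        # the number of good copies never dropped below replica_count
        assert len(replica_set) == replica_count
        return replica_set

    # hypothetical brick names; heal is stubbed out for the sketch
    bricks = ["node1:/b1", "node2:/b1", "node3:/b1"]
    move_brick(bricks, src="node3:/b1", dst="node4:/b1",
               replica_count=3, heal=lambda b: None)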

In essence we are looking at "+1 scaling" (what that +1 is, a disk, a node, ... is not set in stone yet, but converging on a disk is fine as an example). +1 scaling involves,
 a) the ability to retain replication/availability levels
 b) optimal data movement
 c) optimal/acceptable time before the added capacity is available for use (by the consumer of the volume)
 d) is there a (d)? Knowing that would help in getting the requirements clear...

Brick splitting can help with (b) and (c), with strategies like [1] for (a), IMO.

Brick splitting also brings in complexities in DHT (like having to look up everywhere, or the increased distribute count). Such complexities have some solutions (like lookup-optimize), and will possibly need some testing and benchmarking to ensure we do not trip at this layer.
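
As a rough illustration of the scale concern (made-up numbers, only the shape matters): if a missed lookup on the hashed subvolume falls back to asking every subvolume, the per-name fan-out grows linearly with the distribute count, which is exactly what lookup-optimize is meant to keep in check.

    def lookups_per_name(nbricks, miss_rate):
        # 1 RPC to the hashed subvolume, plus a fan-out to every subvolume
        # for the fraction of lookups that miss there
        return 1 + miss_rate * nbricks

    for nbricks in (16, 256, 4096):
        print("%5d bricks: ~%.0f lookup RPCs per name at a 5%% miss rate"
              % (nbricks, lookups_per_name(nbricks, miss_rate=0.05)))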

Also, brick multiplexing is already in the code base to deal with large(r) numbers of bricks per node. That would be the default with brick splitting and hence would help.

Further, the direction with JBR needed a leader per node for a brick (so that clients utilize all server connections rather than just the one to the leader), and was possibly the birthplace of the brick-splitting thought.

Also, the idea behind DHT2 having larger bucket counts than real bricks was to deal with (b).

Why I put this story together is to state 2 things:
- We realize that we need this, and have been working on strategies towards achieving the same
- We need the bits chained right, so that we can make this work, and there is substantial work to be done here

Shyam

[1] Thoughts on moving a brick in a pure dist/replica/EC setup to another location, within or across nodes (my first comment on this issue; GitHub does not have a comment index for me to point to the exact comment): https://github.com/gluster/glusterfs/issues/170
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users


