> And that's why I really prefer gluster, without any metadata or
> similar. But metadata servers aren't mandatory to achieve automatic
> rebalance. Gluster is already able to rebalance and move data around
> the cluster, and already has the tools to add a single server even in
> a replica 3.
>
> What I'm asking is to automate this feature. Gluster could be able to
> move bricks around without user intervention.

Some of us have thought long and hard about this. The root of the
problem is that our I/O stack works on the basis of replicating bricks,
not files. Changing that would be hard, but so is working with it. Most
ideas (like Joe's) involve splitting larger bricks into smaller ones, so
that the smaller units can be arranged into more flexible
configurations.

So, for example, let's say you have bricks X through Z, each split in
two. Define replica sets along the diagonal and place some files A
through L.

                  Brick X   Brick Y   Brick Z
                +---------+---------+---------+
Subdirectory 1  | A B C D | E F G H | I J K L |
                +---------+---------+---------+
Subdirectory 2  | I J K L | A B C D | E F G H |
                +---------+---------+---------+

Now suppose you want to add a fourth brick on a fourth machine. Each
(divided) brick should now contain three files instead of four, so some
will have to move. Here's one possibility, based on our algorithms to
maximize overlaps between the old and new DHT hash ranges.

                  Brick X   Brick Y   Brick Z   Brick W
                +---------+---------+---------+---------+
Subdirectory 1  |  A B C  |  D E F  |  J K L  |  G H I  |
                +---------+---------+---------+---------+
Subdirectory 2  |  G H I  |  A B C  |  D E F  |  J K L  |
                +---------+---------+---------+---------+

Even trying to minimize data motion, a third of all the files have to
be moved.
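To make the "maximize overlaps" idea concrete, here is a rough sketch in plain Python (not Gluster code): each brick owns an equal, contiguous hash range, and when a brick is added, each new range is matched to the old brick it overlaps most, so files already in place stay put. The brick names, the idealized evenly spaced file hashes, and the greedy matching are illustrative assumptions, not Gluster's actual DHT implementation.

```python
# Sketch of DHT-style layout recalculation: going from 3 to 4 equal hash
# ranges, match new ranges to old bricks by maximum overlap and count how
# many files change owner.
from fractions import Fraction

FILES = "ABCDEFGHIJKL"

def hash_of(name):
    # Idealized hash: files A..L land at evenly spaced points in [0, 1).
    return Fraction(FILES.index(name), len(FILES))

def ranges(n):
    # n equal, contiguous hash ranges covering [0, 1).
    return [(Fraction(i, n), Fraction(i + 1, n)) for i in range(n)]

def place(rngs):
    # Map each range index to the files whose hash falls inside it.
    out = {i: [] for i in range(len(rngs))}
    for f in FILES:
        h = hash_of(f)
        for i, (lo, hi) in enumerate(rngs):
            if lo <= h < hi:
                out[i].append(f)
    return out

def overlap(a, b):
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return max(hi - lo, Fraction(0))

def match(old_rngs, new_rngs, old_names, new_name):
    # Give each new range to the old brick whose range it overlaps most;
    # ranges left over go to the newly added brick.
    owner = ["?"] * len(new_rngs)
    free = set(old_names)
    pairs = [(overlap(o, n), bi, ni)
             for bi, o in enumerate(old_rngs)
             for ni, n in enumerate(new_rngs)]
    for ov, bi, ni in sorted(pairs, key=lambda t: (-t[0], t[1], t[2])):
        if ov > 0 and old_names[bi] in free and owner[ni] == "?":
            owner[ni] = old_names[bi]
            free.discard(old_names[bi])
    return [o if o != "?" else new_name for o in owner]

old_rngs, new_rngs = ranges(3), ranges(4)
old_names = ["X", "Y", "Z"]
new_names = match(old_rngs, new_rngs, old_names, "W")

before = {f: old_names[i] for i, fs in place(old_rngs).items() for f in fs}
after = {f: new_names[i] for i, fs in place(new_rngs).items() for f in fs}
moved = sorted(f for f in FILES if before[f] != after[f])
print(moved)  # ['D', 'G', 'H', 'I'] -- exactly a third of the 12 files
```

Under these idealized hashes the result reproduces the table above (X keeps A-C, Y keeps E-F and gains D, W takes G-I, Z keeps J-L), and the moved set is exactly a third of the files, matching the claim in the text.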
This can be reduced still further by splitting the original bricks into
even smaller parts, and that actually meshes quite well with the
"virtual nodes" technique used by other systems that do similar
hash-based distribution, but it gets so messy that I won't even try to
draw the pictures. The main point is that doing all this requires
significant I/O, with significant impact on other activity on the
system, so it's not necessarily true that we should just do it without
user intervention.

Can we automate this process? Yes, and we should. This is already in
scope for GlusterD 2. However, in addition to the obvious recalculation
and rebalancing, it also means setting up the bricks differently even
when a volume is first created, and making sure that we don't
double-count available space on two bricks that are really on the same
disks or LVs, and so on. Otherwise, the initial setup will seem simple
but later side effects could lead to confusion.

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users