On 05/09/2015 03:19 PM, Krishnan Parthasarathi wrote:
Why not break glusterd into small parts and distribute the load to
different people? Did you guys plan anything for 4.0 for breaking glusterd?
It is going to be a maintenance hell if we don't break it sooner.
Good idea. We have thought about it. Just re-architecting glusterd doesn't
(and will not) solve the division of responsibility issue that is being discussed here.
It's already difficult to maintain glusterd. I have already explained the reasons
in the previous thread.
I was thinking *-cli xlators could be maintained by the respective fs
team itself. It is easier to maintain it this way because each of those
xls can be put in xlators/cluster/afr/cli, xlators/cluster/dht/cli, etc.
My feeling is that this gives a clear demarcation of who owns what.
Even the tests can be organized into tests/afr-cli, tests/dht-cli, etc.
Glusterd does a lot of things. Let's see how we can break things up one
thing at a time. I would love to spend some quality time thinking about
this problem once I am done with ec work, but this is a rough idea I
have for glusterd.
1) CLI handling:
Glusterd-cli-xlator should act something like fuse in the fs. It just gets
the commands and passes them down, just like fuse gets the fops and passes
them down. In the glusterd process there should be snapshot.so, afr-cli.so,
ec-cli.so and dht-cli.so loaded as management xlators.
Just like we have fops, let's have mops (management operations):
LOCK/STAGE/BRICK-OP/COMMIT-OP, and if there are more, add them as well.
Every time the top xlator in glusterd receives a command from the cli, it
converts the params into the arguments (req, op, dict etc.) that are
needed to carry out the command. It then winds the mop to all its
children. One of the children is going to handle it locally, while the
other child will send the command to the other glusterds in the cluster.
The second child of glusterd-cli-xlator (give it a better name, but for
now let's call it mgmtcluster) will gather the responses and give the
list of responses to glusterd-cli-xlator, which will call the COLLATE mop
on the first child (let's call it local-handler) to collate them, i.e.
the logic for collating responses should also live in snapshot.so,
afr-cli.so, dht-cli.so, etc. Once the top translator has done LOCK,
STAGE, BRICK-OP and COMMIT-OP, it sends the response to the CLI.
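
A rough sketch of the flow described in 1), in Python only to keep it
short. Every class and method name here (FeatureCli, LocalHandler, wind,
collate, run) is made up for illustration and is not existing glusterd
code; the peers just acknowledge instead of doing a real RPC.

```python
from enum import Enum


class Mop(Enum):
    """Management operations named in the proposal, in the order they run."""
    LOCK = "LOCK"
    STAGE = "STAGE"
    BRICK_OP = "BRICK-OP"
    COMMIT_OP = "COMMIT-OP"


class FeatureCli:
    """Stand-in for afr-cli.so, snapshot.so, etc. (hypothetical interface)."""

    def __init__(self, name):
        self.name = name

    def handle(self, mop, args):
        # A real xlator would validate or apply the command here.
        return {"node": "localhost", "xlator": self.name, "mop": mop.value, "ok": True}

    def collate(self, responses):
        # Feature-specific collation; this one only checks that all nodes agreed.
        return all(r["ok"] for r in responses)


class LocalHandler:
    """First child: handles the mop locally via the owning feature xlator."""

    def __init__(self, features):
        self.features = features

    def wind(self, mop, feature, args):
        return [self.features[feature].handle(mop, args)]

    def collate(self, feature, responses):
        return self.features[feature].collate(responses)


class MgmtCluster:
    """Second child: forwards the mop to the other glusterds (faked here)."""

    def __init__(self, peers):
        self.peers = peers

    def wind(self, mop, feature, args):
        return [{"node": p, "xlator": feature, "mop": mop.value, "ok": True}
                for p in self.peers]


class GlusterdCliXlator:
    """Top xlator: winds each mop to both children, then asks for COLLATE."""

    def __init__(self, local, cluster):
        self.local = local
        self.cluster = cluster
        self.trace = []

    def run(self, feature, args):
        for mop in Mop:  # Enum members iterate in definition order
            responses = (self.local.wind(mop, feature, args)
                         + self.cluster.wind(mop, feature, args))
            self.trace.append(mop.value)
            if not self.local.collate(feature, responses):
                return mop.value + " failed"
        return "success"


top = GlusterdCliXlator(LocalHandler({"afr": FeatureCli("afr")}),
                        MgmtCluster(["peer1", "peer2"]))
result = top.run("afr", {"volname": "vol0"})
```

The point of the split is that the top xlator knows nothing about afr or
snapshots; it only knows the sequence of mops and whom to ask for
collation.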
2) Volinfo should become more like inode_t in the fs, where each *-cli
xlator can store its own ctx: snapshot-cli can store all snapshot-related
info for that volume in its context, and afr can store afr-related info
in the ctx. The volinfo data structure should hold very minimal
information, maybe name, bricks, etc.
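
To make 2) concrete, here is a minimal sketch of a volinfo with
per-xlator contexts, loosely modelled on the way xlators hang their own
ctx off an inode. The Volinfo class, ctx_set/ctx_get and the example
context contents are all assumptions for illustration.

```python
class Volinfo:
    """Minimal volume record; everything feature-specific lives in a ctx."""

    def __init__(self, name, bricks):
        self.name = name
        self.bricks = list(bricks)
        self._ctx = {}  # keyed by the name of the xlator that owns the ctx

    def ctx_set(self, xlator, value):
        self._ctx[xlator] = value

    def ctx_get(self, xlator, default=None):
        return self._ctx.get(xlator, default)


vol = Volinfo("vol0", ["host1:/bricks/b0", "host2:/bricks/b0"])
# Each *-cli xlator stores and reads only its own context.
vol.ctx_set("snapshot-cli", {"snapshots": ["snap1"]})
vol.ctx_set("afr-cli", {"replica_count": 2})
```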
3) Daemon handling:
Daemon-manager xlator should have MOPS like START/STOP/INFO, and
this xlator should be accessible to all the *-cli xlators which want to
do their own management of the daemons, i.e. ec-cli/afr-cli should do
self-heal-daemon handling, dht should do rebalance process handling, etc.
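
A sketch of what such a daemon-manager interface could look like. It
only keeps an in-memory registry and does not spawn anything; the method
names, the (daemon, volname) key and the "owner" field are hypothetical.

```python
class DaemonManager:
    """Registry with START/STOP/INFO mops, shared by all *-cli xlators."""

    def __init__(self):
        self._daemons = {}

    def start(self, daemon, volname, owner, **info):
        # A real implementation would fork/exec the daemon here.
        entry = {"owner": owner, "state": "running", **info}
        self._daemons[(daemon, volname)] = entry
        return entry

    def stop(self, daemon, volname):
        entry = self._daemons.get((daemon, volname))
        if entry is None:
            return False
        entry["state"] = "stopped"
        return True

    def info(self, daemon, volname):
        return self._daemons.get((daemon, volname))


mgr = DaemonManager()
# afr-cli manages the self-heal daemon; dht-cli manages rebalance.
mgr.start("self-heal-daemon", "vol0", owner="afr-cli")
mgr.start("rebalance", "vol0", owner="dht-cli")
mgr.stop("rebalance", "vol0")
```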
For example, while winding the START mop, the caller has to specify the
daemon as "self-heal-daemon" and give enough info.
4) Peer handling:
mgmtcluster (second child of the top xlator) should have MOPS like
PEER_ADD/PEER_DEL/PEER_UPDATE etc. to do the needful. The top xlator is
going to wind these operations to this xlator based on the peer cli
commands.
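
A small sketch of the peer mops on mgmtcluster, assuming peers are keyed
by hostname. The class name, the return values and the "state" attribute
are made up; real peer handling involves handshakes that are not modelled
here.

```python
class MgmtClusterPeers:
    """Peer table with PEER_ADD / PEER_UPDATE / PEER_DEL mops."""

    def __init__(self):
        self.peers = {}

    def peer_add(self, hostname, **attrs):
        if hostname in self.peers:
            return False  # already part of the cluster
        self.peers[hostname] = dict(attrs)
        return True

    def peer_update(self, hostname, **attrs):
        if hostname not in self.peers:
            return False
        self.peers[hostname].update(attrs)
        return True

    def peer_del(self, hostname):
        return self.peers.pop(hostname, None) is not None


cluster = MgmtClusterPeers()
cluster.peer_add("host2", state="probing")
cluster.peer_update("host2", state="connected")
```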
5) volgen:
The top xlator is going to wind a MOP called GET_NODE_LINKS, which takes
the type of volfile (i.e. mount/nfs/shd/brick etc.). Based on it, each
*-cli will construct its node(s), stuff the options, and tell the name of
the parent xl to which each node needs to be linked. The top xlator just
links the nodes to construct the graph and does graph_print to generate
the volfile.
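
A sketch of how GET_NODE_LINKS could drive volgen. Each function below
plays the role of one *-cli xlator answering the mop; the top level only
links the nodes by parent name and prints them children-first in volfile
syntax. The Node type, the function names, and the volume, host and brick
values are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str
    type: str
    parent: Optional[str]  # name of the xl this node is linked under
    options: dict = field(default_factory=dict)


def client_get_node_links(volfile_type):
    if volfile_type != "mount":
        return []
    return [Node("vol0-client-%d" % i, "protocol/client", "vol0-replicate-0",
                 {"remote-host": "host%d" % i, "remote-subvolume": "/bricks/b0"})
            for i in range(2)]


def afr_get_node_links(volfile_type):
    if volfile_type != "mount":
        return []
    return [Node("vol0-replicate-0", "cluster/replicate", "vol0-dht")]


def dht_get_node_links(volfile_type):
    if volfile_type != "mount":
        return []
    return [Node("vol0-dht", "cluster/distribute", None)]


def build_volfile(volfile_type, providers):
    nodes = [n for get_links in providers for n in get_links(volfile_type)]
    children, roots = {}, []
    for n in nodes:
        if n.parent is None:
            roots.append(n)
        else:
            children.setdefault(n.parent, []).append(n)
    lines = []

    def emit(node):
        kids = children.get(node.name, [])
        for kid in kids:  # a volfile defines subvolumes before their parent
            emit(kid)
        lines.append("volume " + node.name)
        lines.append("    type " + node.type)
        for key, value in sorted(node.options.items()):
            lines.append("    option %s %s" % (key, value))
        if kids:
            lines.append("    subvolumes " + " ".join(k.name for k in kids))
        lines.append("end-volume")
        lines.append("")

    for root in roots:
        emit(root)
    return "\n".join(lines)


volfile = build_volfile(
    "mount", [dht_get_node_links, afr_get_node_links, client_get_node_links])
```

Note that the order in which the *-cli xlators answer does not matter,
since linking is done purely by parent name.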
I am pretty sure I forgot some more aspects of what glusterd does, but
you get the picture, right? Break each aspect into a different xlator and
have MOPS to solve them.
We have some initial ideas on what glusterd for 4.0 would look like. We won't be
continuing with the "glusterd is also a translator" model. The above model would
work well only if we stuck with the stack-of-translators approach.
Oh nice, I might have missed the mails. Do you mind sharing the plan for
4.0? Any reason why you guys do not want to continue with glusterd as a
translator model?
Pranith