Re: detecting replication issues

Hi Mohammed,

It's not a bug per se; it's a configuration and documentation issue. I searched the gluster documentation pretty thoroughly and did not find anything that discussed 1) the client's call graph and 2) how to configure a native glusterfs client to specify that call graph so that replication will happen across multiple bricks. If it's there, then there's a pretty severe organization issue in the documentation (I am pretty sure I ended up reading almost every page, actually).

As a result, because I was new to gluster, my initial setup really confused me. I would follow the instructions as documented in the official gluster docs (execute the mount command), write data on the mount... and then only see it replicated to a single brick. It was only after much furious googling that I managed to figure out that 1) I needed a client configuration file, which should be specified in /etc/fstab, and 2) the configuration block from my original message (quoted below) was the key.

I am actually planning on submitting a PR to the documentation to cover all this. To be clear, I am sure this is obvious to a seasoned gluster user -- but it is not at all obvious to someone who is new to gluster such as myself.

So I am an operations engineer. I like reproducible deployments and I like monitoring to alert me when something is wrong. Due to human error or a bug in our deployment code, it's possible that something like not setting the client call graph properly could happen. I wanted a way to detect this problem so that if it does happen, it can be remediated immediately.
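For the deployment side, a minimal sketch of a static check on the client configuration file might look like the following. It reads the volfile text, collects every volume of type cluster/replicate, and reports a problem if there is none or if one has an unexpected number of subvolumes. The function names replicate_subvolumes and volfile_problems are hypothetical, and the parser assumes the simple "volume ... end-volume" layout quoted below, with lines starting with "#" treated as comments.

```python
# Sample client volfile fragment, copied from the original message.
SAMPLE = """\
volume gv0-replicate-0
    type cluster/replicate
    subvolumes gv0-client-0 gv0-client-1 gv0-client-2
end-volume
"""


def replicate_subvolumes(volfile_text):
    """Map each cluster/replicate volume name to its list of subvolumes."""
    result = {}
    name, vol_type, subvols = None, None, []
    for raw in volfile_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # assumption: "#" starts a comment
            continue
        tokens = line.split()
        keyword = tokens[0]
        if keyword == "volume" and len(tokens) > 1:
            # Start of a new volume block: reset the per-block state.
            name, vol_type, subvols = tokens[1], None, []
        elif keyword == "type" and len(tokens) > 1:
            vol_type = tokens[1]
        elif keyword == "subvolumes":
            subvols = tokens[1:]
        elif keyword == "end-volume":
            if name is not None and vol_type == "cluster/replicate":
                result[name] = subvols
            name = None
    return result


def volfile_problems(volfile_text, expected):
    """Return human-readable problems; an empty list means the check passed."""
    replicas = replicate_subvolumes(volfile_text)
    if not replicas:
        return ["no cluster/replicate volume found"]
    return [
        f"{name}: {len(subvols)} subvolumes, expected {expected}"
        for name, subvols in replicas.items()
        if len(subvols) != expected
    ]


if __name__ == "__main__":
    print(volfile_problems(SAMPLE, expected=3))
```

This only validates the file on disk, not what a running client actually loaded, so it is a complement to a runtime check rather than a replacement for one.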

Your suggestion sounds promising. I shall definitely look into that. Though that might be useful information to surface in a CLI command in a future gluster release, IMHO.

Joe



On Thu, Feb 23, 2017 at 11:51 PM, Mohammed Rafi K C <rkavunga@xxxxxxxxxx> wrote:



On 02/23/2017 11:12 PM, Joseph Lorenzini wrote:
Hi all,

I have a simple replicated volume with a replica count of 3. To ensure any file changes (create/delete/modify) are replicated to all bricks, I have this setting in my client configuration.

volume gv0-replicate-0
    type cluster/replicate
    subvolumes gv0-client-0 gv0-client-1 gv0-client-2
end-volume

And that works as expected. My question is how one could detect if this was not happening, which could pose a severe problem with data consistency and replication. For example, those settings could be omitted from the client config, and then the client would only write data to one brick and all kinds of terrible things would start happening. I have not found a way with the gluster volume CLI to detect when that kind of problem is occurring. For example, gluster volume heal <volname> info does not detect this problem.

Is there any programmatic way to detect when this problem is occurring?


I couldn't understand how you would end up in this situation. There is only one possibility (assuming there is no bug :) ), i.e. you changed the client graph in such a way that the replica xlator has only one subvolume.

The simple way to check that is with an xlator called meta, which exposes metadata through the mount point, similar to the Linux proc file system. So you can check the active graph through meta and see the number of subvolumes for the replica xlator.

For example: the directory <mount point>/.meta/graphs/active/<volname>-replicate-0/subvolumes will have an entry for each replica client, so in your case you should see 3 directories.
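A monitoring probe built on that directory could be sketched as below. It lists the subvolumes directory of the active graph on a mounted client and compares the entry count with the expected replica count. The function names and the replica_index parameter are hypothetical; the path layout is taken from the description above and assumes the meta xlator is present in the client graph.

```python
import os


def replica_subvolumes(mount_point, volname, replica_index=0):
    """List the entries under the replicate xlator's 'subvolumes' directory."""
    path = os.path.join(
        mount_point, ".meta", "graphs", "active",
        f"{volname}-replicate-{replica_index}", "subvolumes",
    )
    return sorted(os.listdir(path))


def replication_ok(mount_point, volname, expected):
    """True if the active graph's replicate xlator has `expected` subvolumes."""
    try:
        return len(replica_subvolumes(mount_point, volname)) == expected
    except OSError:
        # Directory missing or unreadable: either the graph has no replicate
        # xlator under that name, or meta is not available on this mount.
        return False
```

For the volume in this thread, a call such as replication_ok("/mnt/gv0", "gv0", 3) would be the check to alert on, where /mnt/gv0 is a placeholder for the real mount point.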


Let me know if this helps.

Regards
Rafi KC


Thanks,
Joe





_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users

