Re: Automated split-brain resolution

On 08/10/2014 11:42 PM, Ravishankar N wrote:
On 08/09/2014 01:23 AM, Joe Julian wrote:
Thinking about it more, I'd still rather have this functionality exposed at the client through xattrs. For 5 years I've thought about this, and the more I encounter split-brain, the more I think this is the needed approach.


Joe, why do you feel resolving split-brains should be exposed to clients? Whatever approach is taken (either a gluster CLI command or an overloaded getfattr/setfattr call), is it not better to have this done on the server side?

* It's consistent with the way other functions (rebalance, self-heal, etc.) actually operate, in that they're really just clients.
* On the client it offers more possibilities for us admins to be able to fix something on the fly.
* It's an API at that point. Software could be coded to perform its own self-heal based on the rules that might apply to that particular use case.
* If multi-tenancy is ever added, it is a method by which the tenant can repair his own files.

It was late last time, and I missed one important operation: the ability to mv one copy of the split-brain to a new filename in case you choose wrongly and need it. I've seen that with VM images. Typically, it doesn't really matter which VM image you choose (if your data's in a smart place instead of on the image). Pick either one and boot it back up. Occasionally, though, the image is irreparable. Frequently, the "other copy" is ok, so if one fails to boot, we swap to the other.

"getfattr -n trusted.glusterfs.stat" returns xml/json/some_madeup_datastructure with the results of stat from each brick
"getfattr -n trusted.glusterfs.afr" returns the afr matrix
"setfattr -n trusted.glusterfs.sb-pick -v server2:/srv/brick1"

That gives us the tools we need to choose what to do with any given split-brain. For large swaths of automated repair, we can use find.
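The find-style bulk repair mentioned above could be sketched roughly as follows. This is only an illustration: trusted.glusterfs.sb-pick is the xattr proposed in this thread, not an existing interface, and the helper name sb_pick_commands, the name pattern, and the mount path are assumptions made for the example. The sketch only builds the setfattr command lines; it does not run them.

```python
import fnmatch
import os

# Proposed in this thread; hypothetical, not an existing GlusterFS xattr.
SB_PICK_XATTR = "trusted.glusterfs.sb-pick"


def sb_pick_commands(mount_root, name_pattern, source_brick):
    """Yield one setfattr argv per file under mount_root whose name matches
    name_pattern (similar in spirit to `find mount_root -name pattern`)."""
    for dirpath, _dirnames, filenames in os.walk(mount_root):
        for name in sorted(filenames):
            if fnmatch.fnmatch(name, name_pattern):
                path = os.path.join(dirpath, name)
                # setfattr -n <name> -v <value> <path>
                yield ["setfattr", "-n", SB_PICK_XATTR, "-v", source_brick, path]
```

For example, list(sb_pick_commands("/mnt/gluster", "*.img", "server2:/srv/brick1")) would give one command per matching VM image on a (hypothetical) client mount; each argv could be handed to subprocess.run if such an xattr were ever implemented.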

I suppose that last bit could still be implemented through that cli command.


On 08/07/2014 01:35 AM, Ravishankar N wrote:

Manual resolution of split-brains [1] has been a tedious task involving understanding and modifying AFR's changelog extended attributes. To simplify and, to an extent, automate this task, we are proposing a new CLI command with which the user can specify what the source brick/file is and automatically heal the files in the appropriate direction.

Command: gluster volume resolve-split-brain <VOLNAME> {<bigger_file> | source-brick <brick_name> [<file>]}

Breaking up the command into its possible options, we have:

a) gluster volume resolve-split-brain <VOLNAME> <bigger_file>
When this command is executed, AFR will consider the brick holding the largest copy of the file as the source and heal it to all other bricks (including all other sources and sinks) in that replica subvolume. If the file size is the same on all the bricks, it does *not* heal the file.
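As a rough illustration of the bigger-file policy in (a), here is a minimal sketch of the source selection only; it is not AFR code. The function name and the brick names are made up for the example. One assumption goes beyond the text above: if several bricks tie for the largest size while another brick is smaller, the sketch also declines to pick a source.

```python
def pick_bigger_file_source(sizes_by_brick):
    """Return the brick holding the strictly largest copy, or None when no
    source can be chosen (no bricks, or the largest size is not unique)."""
    if not sizes_by_brick:
        return None
    largest = max(sizes_by_brick.values())
    candidates = [brick for brick, size in sizes_by_brick.items() if size == largest]
    # Equal sizes on all bricks means no heal, as described above.
    # Assumption: a partial tie for the largest size is treated the same way.
    if len(candidates) != 1:
        return None
    return candidates[0]
```

With sizes {"server1:/srv/brick1": 100, "server2:/srv/brick1": 250} the sketch picks server2:/srv/brick1; with equal sizes it returns None and nothing would be healed.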

b) gluster volume resolve-split-brain <VOLNAME> source-brick <brick_name> [<file>]

When this command is executed, if <file> is specified, AFR heals the file from the source-brick <brick_name> to all other bricks of that replica subvolume. For resolving multiple files, the command must be run iteratively, once per file.
If <file> is not specified, AFR heals all the files that have an entry in .glusterfs/indices/xattrop *and* are in split-brain. As before, heals happen from source-brick <brick_name> to all other bricks.
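Since option (b) has to be run once per file, a small wrapper could generate the invocations. The sketch below assumes the command syntax exactly as proposed in this message (it is a proposal, not a shipped CLI); the helper name, the volume name, and the file paths are hypothetical. It only builds the argument lists.

```python
def resolve_commands(volname, source_brick, files=None):
    """Build argv lists for the proposed source-brick form of the command:
    one per file, or a single volume-wide command when no files are given."""
    base = ["gluster", "volume", "resolve-split-brain",
            volname, "source-brick", source_brick]
    if not files:
        # No <file>: heal every split-brain file listed in the xattrop index.
        return [base]
    return [base + [path] for path in files]
```

For instance, resolve_commands("myvol", "server2:/srv/brick1", ["/vm1.img", "/vm2.img"]) yields two command lines, one per file, matching the iterative usage described above.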

Future work could also include extending the command with other policies, such as choosing the file with the latest mtime as the source, and integrating with the trash xlator so that files deleted from the sink are moved to the trash directory.

Please give feedback on the above.

Regards,
Ravi

[1] https://github.com/gluster/glusterfs/blob/master/doc/split-brain.md

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


