On 08/10/2014 11:42 PM, Ravishankar N wrote:
On 08/09/2014 01:23 AM, Joe Julian wrote:
Thinking about it more, I'd still rather have this functionality
exposed at the client through xattrs. For 5 years I've thought
about this, and the more I encounter split-brain, the more I
think this is the needed approach.
Joe, why do you feel resolving split-brains should be exposed to
clients? Whatever approach is taken (either a gluster CLI command
or an overloaded get/setfattr call), is it not better to have this
done at the server side?
* It's consistent with the way other functions (rebalance, self-heal,
etc.) actually operate, in that they're really just clients.
* On the client it offers more possibilities for us admins to be
able to fix something on the fly.
* It's an API at that point. Software could be coded to perform its
own self-heal based on the rules that might apply to that particular
use case.
* If multi-tenancy is ever added, it is a method by which the tenant
can repair his own files.
It was late last time, and I missed one important operation: the
ability to mv one copy of the split-brain to a new filename in case
you choose wrongly and need it. I've seen that with VM images.
Typically, it doesn't really matter which VM image you choose (if
your data's in a smart place instead of on the image). Pick either
one and boot it back up. Occasionally, though, the image is
irreparable. Frequently, the "other copy" is ok, so if one fails to
boot, we swap to the other.
"getfattr -n trusted.glusterfs.stat" returns
xml/json/some_madeup_datastructure with the results of stat from
each brick
"getfattr -n trusted.glusterfs.afr" returns the afr matrix
"setfattr -n trusted.glusterfs.sb-pick -v server2:/srv/brick1"
That gives us the tools we need to choose what to do with any
given split-brain. For large swaths of automated repair, we can
use find.
I suppose that last bit could still be implemented through that
cli command.
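As a rough sketch of the "use find" idea above, a client-side script could walk a mount and pick the same brick for every file it finds. This assumes the proposed trusted.glusterfs.sb-pick xattr existed; it is only a suggestion in this thread, not an existing GlusterFS interface. The names pick_source_for_tree and set_xattr are hypothetical, and the setter is injectable so the walk can be tried without a Gluster mount.

```python
import os

# Proposed in this thread only; NOT an existing GlusterFS xattr.
SB_PICK_XATTR = "trusted.glusterfs.sb-pick"


def pick_source_for_tree(root, brick, set_xattr=None):
    """Walk `root` and set the proposed sb-pick xattr on every file entry.

    `brick` is a string such as "server2:/srv/brick1".
    `set_xattr` defaults to os.setxattr (Linux only); pass a stand-in to
    dry-run. Returns the list of paths on which the call succeeded.
    """
    if set_xattr is None:
        set_xattr = os.setxattr
    picked = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                set_xattr(path, SB_PICK_XATTR, brick.encode())
            except OSError:
                # Assumption: the call fails on files that cannot be
                # resolved this way; skip them and keep going.
                continue
            picked.append(path)
    return picked
```

Because the xattr is hypothetical, how a real client would report "not in split-brain" is unknown; the sketch simply treats any OSError as "skip this file".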
On 08/07/2014 01:35 AM, Ravishankar N wrote:
Manual resolution of split-brains [1] has been a tedious task
involving understanding and modifying AFR's changelog extended
attributes. To simplify and to an extent automate this task,
we are proposing a new CLI command with which the user can
specify what the source brick/file is, and automatically heal
the files in the appropriate direction.
Command: gluster volume resolve-split-brain <VOLNAME>
{<bigger_file> | source-brick <brick_name> [<file>]}
Breaking up the command into its possible options, we have:
a) gluster volume resolve-split-brain <VOLNAME> <bigger_file>
When this command is executed, AFR will consider the brick
having the highest file size as the source and heal it to all
other bricks (including all other sources and sinks) in that
replica subvolume. If the file size is the same on all the bricks,
it does *not* heal the file.
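The selection rule for option (a) might be sketched as below. This is only an illustration of the policy as described, not AFR's implementation; the function name choose_source_by_size and the dict-of-sizes input are hypothetical. The proposal does not say what happens when several bricks tie for the largest size while another is smaller, so the sketch assumes the conservative answer and declines to heal in that case too.

```python
def choose_source_by_size(sizes):
    """Return the brick to use as heal source, or None for "do not heal".

    `sizes` maps a brick name to the file's size in bytes on that brick.
    """
    if len(sizes) < 2:
        # Nothing to compare against, so nothing to heal.
        return None
    largest = max(sizes.values())
    candidates = [brick for brick, size in sizes.items() if size == largest]
    if len(candidates) != 1:
        # Covers "same size on all bricks" from the proposal, plus the
        # unspecified tie-for-largest case (an assumption of this sketch).
        return None
    return candidates[0]
```

For example, choose_source_by_size({"brick1": 100, "brick2": 250}) selects "brick2", while two bricks of equal size yield None.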
b) gluster volume resolve-split-brain <VOLNAME>
source-brick <brick_name> [<file>]
When this command is executed, if <file> is specified,
AFR heals the file from the source-brick <brick_name> to
all other bricks of that replica subvolume. For resolving
multiple files, the command must be run iteratively, once per
file.
If <file> is not specified, AFR heals all the files that
have an entry in .glusterfs/indices/xattrop *and* are in split-brain. As
before, heals happen from source-brick <brick_name> to
all other bricks.
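To make the file selection in option (b) concrete, here is a minimal sketch of which files would be healed from the source brick. It only models the rule stated above (an explicit <file>, or else entries that are both in the xattrop index *and* in split-brain). The function name files_to_heal and the plain-list inputs are hypothetical; how AFR actually enumerates the index or detects split-brain is not modelled.

```python
def files_to_heal(index_entries, split_brain_files, file=None):
    """Return the files option (b) would heal from the source brick.

    `index_entries`: identifiers listed under .glusterfs/indices/xattrop.
    `split_brain_files`: identifiers known to be in split-brain.
    `file`: the optional <file> argument; when given, only it is healed.
    """
    if file is not None:
        return [file]
    in_split_brain = set(split_brain_files)
    # Keep index order; an entry qualifies only if it is also in split-brain.
    return [entry for entry in index_entries if entry in in_split_brain]
```

An entry that is pending heal but not in split-brain is left alone, as is a split-brain file with no index entry.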
Future work could also include extending the command to add
other policies, like choosing the file having the latest mtime
as the source, or integration with the trash xlator, wherein the
files deleted from the sink are moved to the trash dir, etc.
Please give feedback on the above.
Regards,
Ravi
[1] https://github.com/gluster/glusterfs/blob/master/doc/split-brain.md
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-devel