On 18/04/2016 4:09 PM, Pranith Kumar Karampuri wrote:
> Since you are probably the first user who is putting sharding in
> production, you may be the first person to run into some issues which
> no one has faced till now.

Eeep! Now I feel nervous :)

Actually, I think Kevin Lemonnier is using sharding with 3.7.6.

> We want to make sure all your questions/problems are addressed.
Thanks! We're on day two with no problems so far; I moved another VM to
the volume.
I also killed the gluster processes on one node to simulate a node
failure. VMs continued running without a hitch; no one noticed. I kept
it that way for a few minutes, and the heal count got up to 400 of the
64MB shards before I restarted the services. The heal process took
around 16 minutes, which was very pleasing, and iowait stayed below 5%.
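Roughly what the test looked like, for anyone curious (the volume name
"datastore" is just a placeholder for mine, and these are systemd
boxes):

  # On the node being "failed": kill all the gluster processes
  pkill glusterd          # management daemon
  pkill glusterfsd        # brick processes
  pkill glusterfs         # self-heal and client daemons

  # On a surviving node: watch the pending heal count climb
  gluster volume heal datastore statistics heal-count

  # Bring the node back and let self-heal catch up
  systemctl start glusterd
  gluster volume heal datastore info | grep -c shard   # rough progress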
One of the things I will be doing is stopping the volume over the
weekend and running a file comparison between the nodes as a
consistency check. What I do on each node is:
1. md5deep <brick path>/.shard > md5sum.txt
2. sort md5sum.txt > md5sum.txt.sorted
3. md5sum md5sum.txt.sorted > brick.md5sum
4. Compare brick.md5sum across the nodes; they should be identical.
md5deep generates md5sums for all the files in a directory, then I sort
the results, as they aren't necessarily in the same order across nodes.
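For reference, here's the whole check rolled into one script. The brick
path is a placeholder, I've added -r since md5deep wants it to descend
into a directory, and note it assumes the brick path is identical on
every node, because md5deep prints the full path next to each hash:

  #!/bin/sh
  # Hash every shard on this brick and boil it down to one checksum.
  BRICK=/path/to/brick            # placeholder - your brick path here

  md5deep -r "$BRICK/.shard" > md5sum.txt   # one hash line per shard
  sort md5sum.txt > md5sum.txt.sorted       # order varies across nodes
  md5sum md5sum.txt.sorted > brick.md5sum   # single line to compare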
Unfortunately, it does take a very long time to run :( If you know of a
faster way to compare 810GB of data, that would be great :)
Cheers,
--
Lindsay Mathieson
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users