Re: Unnecessary healing in 3-node replication setup on reboot

On 16.10.2015 at 18:51, Vijay Bellur wrote:

self-healing in gluster by default syncs only modified parts of the files from a source node. Gluster does a rolling checksum of a file needing self-heal to identify regions of the file which need to be synced over the network. This rolling checksum computation can sometimes be expensive and there are plans to have a lighter self-healing in 3.8 with more granular changelogs that can do away with the need to do a rolling checksum.

I ran some tests (see below) - could you please take a look and tell me whether this is normal?


For example, I have a 200GB VM disk image in my volume (the biggest file). About 75% of that disk is currently unused space, and writes to it amount to only about 50 kB/s.
Yet that 200GB disk image always takes a very long time to "heal" (at least 30 minutes), even though I'm fairly sure only a few blocks can have changed.
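If the rolling checksum is what eats most of that time, one thing I could try is pinning the self-heal algorithm explicitly and comparing. A rough sketch, assuming the cluster.data-self-heal-algorithm volume option (values "diff" and "full") behaves the same way in 3.5.2:

 # force "diff": only changed blocks are checksummed and synced
 gluster volume set systems cluster.data-self-heal-algorithm diff

 # or force "full" for comparison, to see whether the checksumming or the
 # raw data transfer dominates the 30+ minutes on the 200GB image
 gluster volume set systems cluster.data-self-heal-algorithm full

 # back to the built-in default afterwards
 gluster volume reset systems cluster.data-self-heal-algorithm

Would that be a sensible experiment, or is it risky on images of this size?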


Anyway, I just rebooted one node (about 2-3 minutes of downtime) to collect some information:
  • in total there are about 790 GB* of files in that Gluster volume
  • about 411 GB* belong to active VM HDD images; the rest are backup/template files
  • only VM HDD images are being healed (at most 15 files)
  • while healing, glusterfsd shows varying CPU usage between 70% and 650% (it's a 16-core server); 106 minutes of CPU time in total once healing completed
  • by the time healing completed, the machine had received a total of 7.0 GB and sent 3.6 GB over the internal network (so, yes, you're right that not all contents are transferred)
  • total heal time: a whopping 58 minutes
* these are summed-up file sizes; "du" and "df" show smaller usage (see the check below)
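I assume the difference is because the VM images are thin-provisioned/sparse; a quick way to see the gap between apparent and allocated size (the path below is just a placeholder for one of my images):

 # apparent size (what I summed up above) vs. blocks actually allocated on disk
 du -h --apparent-size /data/gluster/systems/images/<vmid>/<disk-image>
 du -h /data/gluster/systems/images/<vmid>/<disk-image>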

Node details (all 3 nodes are identical):
  • DELL PowerEdge R730
  • Intel Xeon E5-2600 @ 2.4GHz
  • 64 GB DDR4 RAM
  • the server can gzip-compress about 1 GB of data per second (all cores combined)
  • 3 TB HW-RAID10 HDD (2.7 TB reserved for Gluster); at least 500 MB/s write and 350 MB/s read speed (see the dd sketch after this list)
  • redundant 1 Gbit/s internal network
  • Debian 7 Wheezy / Proxmox 3.4, kernel 2.6.32, Gluster 3.5.2
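(For reference, throughput numbers like the ones above can be reproduced with a plain dd against the brick filesystem, bypassing the page cache; the test file is just a throwaway path:)

 # sequential write
 dd if=/dev/zero of=/data/gluster/ddtest bs=1M count=4096 oflag=direct
 # sequential read of the same file
 dd if=/data/gluster/ddtest of=/dev/null bs=1M iflag=direct
 rm /data/gluster/ddtest
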
Volume setup:
 # gluster volume info systems

Volume Name: systems
Type: Replicate
Volume ID: b2d72784-4b0e-4f7b-b858-4ec59979a064
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: metal1:/data/gluster/systems
Brick2: metal2:/data/gluster/systems
Brick3: metal3:/data/gluster/systems
Options Reconfigured:
cluster.server-quorum-ratio: 51%
Note that `gluster volume heal "systems" info` takes 3-10 seconds to complete while a heal is running - I tend to run that command frequently, so I hope that doesn't slow the healing down.
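For what it's worth, I poll it with something like the loop below; if there's a lighter-weight way to just get a count of pending heals, I'd happily switch. I believe newer releases have a heal-count subcommand, but I'm not sure it exists in 3.5.x:

 # full listing of entries still needing heal (what I run today; the interval is arbitrary)
 watch -n 30 'gluster volume heal systems info'

 # possibly cheaper, if this subcommand is available in the installed release:
 gluster volume heal systems statistics heal-count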


Would you expect these results or is something wrong?

Would upgrading to Gluster 3.6 or 3.7 improve healing performance?

Thanks,
Udo

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
