I'm testing replacing a brick in a replica 3 test volume.
Gluster 3.7.11. The volume hosts two VMs. Three nodes: vna, vnb and vng.
First off I tried removing and re-adding a brick:

    gluster v remove-brick test1 replica 2 vng.proxmox.softlog:/tank/vmdata/test1 force

That worked fine; the VMs (running on another node) kept going without a hiccup. I then deleted /tank/vmdata/test1 and ran:

    gluster v add-brick test1 replica 3 vng.proxmox.softlog:/tank/vmdata/test1 force

That succeeded, and the heal statistics immediately showed 3000+ shards being healed on vna and vnb. Unfortunately they also showed hundreds of shards being healed on vng, which should not have happened as vng had no data on it. A reverse heal, basically. Eventually all the heals completed, but the VMs were hopelessly corrupted.
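For anyone wanting to reproduce this, the heal activity above can be watched with the standard heal commands (volume name test1 as above); roughly:

    # per-brick count of entries still pending heal
    gluster volume heal test1 statistics heal-count

    # list the actual shards/files queued for heal on each brick
    gluster volume heal test1 info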
Then I retried the above, but with all the VMs shut down, i.e. no reads or writes happening on the volume. This worked: all the shards on vna & vnb healed, nothing in reverse, and once the heal completed the data (the VMs) was fine. Unfortunately that isn't practical in production; I can't bring all the VMs down for the 1-2 days the heal would take.

Next I tried replacing the brick. I killed the glusterfsd process for the brick on vng, then ran:

    gluster v replace-brick test1 vng.proxmox.softlog:/tank/vmdata/test1 vng.proxmox.softlog:/tank/vmdata/test1.1 commit force

The vna & vnb shards started healing, but vng showed 5 reverse heals happening. Eventually it got down to 4-5 shards needing healing on each brick and stopped. They didn't go away until I removed the test1.1 brick.

Currently the replace-brick process seems to be unusable except when the volume is not in use.
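For anyone repeating this, the whole vng-side sequence is roughly as follows (the brick PID can be read from volume status; <brick-pid> below is just a placeholder for whatever that reports):

    # find the PID of the glusterfsd process serving the old brick
    gluster volume status test1

    # kill that brick process on vng (use the PID reported above)
    kill <brick-pid>

    # swap in the new, empty brick
    gluster v replace-brick test1 vng.proxmox.softlog:/tank/vmdata/test1 \
        vng.proxmox.softlog:/tank/vmdata/test1.1 commit force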
--
Lindsay Mathieson