On 09/07/13 15:38, Bobby Jacob wrote:
> Hi,
>
> I have a 2-node gluster with 3 TB storage.
>
> 1) I believe the "glusterfsd" is responsible for the self healing between
> the 2 nodes.
>
> 2) Due to some network error, the replication stopped for some reason but
> the application was accessing the data from node1. When I manually try
> to start the "glusterfsd" service, it's not starting.
>
> Please advise on how I can maintain the integrity of the data so that we
> have all the data in both the locations.

There were some bugs in the self-heal daemon present in 3.3.0 and 3.3.1.
Our systems see the SHD crash out with segfaults quite often, and it does
not recover. I reported this bug a long time ago, and it was fixed in
trunk relatively quickly -- however version 3.3.2 has still not been
released, despite the fix being found six months ago. I find this quite
disappointing.

T
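
As a rough sketch of a workaround for the original question (assuming a
standard replicate volume; "VOLNAME" is a placeholder for your volume name,
and the "service glusterd restart" invocation may differ by distro): restart
glusterd, which respawns glustershd (the self-heal daemon), then trigger and
verify a heal manually:

    # restart the management daemon; it respawns glustershd
    service glusterd restart

    # list files that still need healing (replace VOLNAME with your volume)
    gluster volume heal VOLNAME info

    # force a full self-heal across the replica pair
    gluster volume heal VOLNAME full

    # confirm the Self-heal Daemon shows as online on both nodes
    gluster volume status VOLNAME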