Anand Avati wrote:
Gordan, and others using a config like the one in Bug 542 (http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=542): this corruption issue can show up (but not always) when you have loaded both io-cache and write-behind below replicate (syntactically before replicate in the volfile) and a self-heal of a file bigger than 131072 bytes happens. Gordan, we believe this is why your corruption observations are strongly correlated with server reconnections. Please use write-behind and io-cache on top of replicate (the "normal" way, the way glusterfs-volgen would generate), and you will not face this problem. I believe the reason for using io-cache and write-behind below replicate is to improve self-heal performance, for which we suggest using the 3.0.x release, where we have background self-healing and diff-based self-healing. Please read http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=542#c4 for details about the internals of the corruption. In summary, loading the performance translators (specifically io-cache) in the normal location will give you a quick remedy for the bug.
I had observed this corruption issue on the 2.0 branch WITHOUT any performance translators, so there is also something else going wrong.
Gordan
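
For reference, the "normal" ordering described in the quoted message (write-behind and io-cache on top of replicate) might look roughly like the client volfile sketch below. The volume names, host names and the remote subvolume name are hypothetical placeholders, and real glusterfs-volgen output would likely contain more translators and options; only the relative order of the translators is the point here.

```
# Hypothetical client volfile sketch: names and hosts are placeholders.
volume remote1
  type protocol/client
  option transport-type tcp
  option remote-host server1        # assumed host name
  option remote-subvolume brick     # assumed exported volume name
end-volume

volume remote2
  type protocol/client
  option transport-type tcp
  option remote-host server2        # assumed host name
  option remote-subvolume brick
end-volume

# replicate is defined first, directly over the client volumes...
volume replicate
  type cluster/replicate
  subvolumes remote1 remote2
end-volume

# ...and the performance translators are stacked on top of it.
volume writebehind
  type performance/write-behind
  subvolumes replicate
end-volume

volume iocache
  type performance/io-cache
  subvolumes writebehind
end-volume
```

The configuration that the quoted message associates with the corruption is the reverse: write-behind and io-cache each wrapping an individual client volume, with replicate defined after them.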