Re: Another Data Corruption Report

Gordan Bobic <gordan@xxxxxxxxxx> · Mon, 08 Feb 2010 10:44:36 +0000

Anand Avati wrote:
Gordan, and others using a config like Bug 542
(http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=542)

This corruption issue can show up (but not always) when you have
loaded both io-cache and write-behind below replicate (syntactically
before replicate in the volfile) and a self-heal of a file bigger than
131072 bytes happens. Gordan, we believe this is why your corruption
observations are strongly correlated to server reconnections.

Please use write-behind and io-cache on top of replicate (the "normal"
way, the way glusterfs-volgen would generate), and you will not face
this problem. I believe the reason for using io-cache and write-behind
below replicate is to improve self-heal performance. For that we
suggest using the 3.0.x release, which has background self-healing and
diff-based self-healing.
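
For illustration, here is a minimal sketch of a client volfile with the
"normal" ordering, where write-behind and io-cache sit on top of replicate.
The volume names, host names and the remote subvolume name "brick" are
assumptions made up for this example. They are not taken from Bug 542 or
from actual glusterfs-volgen output, which would normally contain more
translators and options.

```
# Hypothetical client-side volfile; all names and hosts are placeholders.
volume remote1
  type protocol/client
  option transport-type tcp
  option remote-host server1.example.com   # assumed host name
  option remote-subvolume brick            # assumed export name
end-volume

volume remote2
  type protocol/client
  option transport-type tcp
  option remote-host server2.example.com   # assumed host name
  option remote-subvolume brick            # assumed export name
end-volume

# replicate is defined first, directly over the protocol/client volumes
volume replicate
  type cluster/replicate
  subvolumes remote1 remote2
end-volume

# performance translators come after replicate in the file,
# i.e. they are stacked on top of it
volume writebehind
  type performance/write-behind
  subvolumes replicate
end-volume

volume iocache
  type performance/io-cache
  subvolumes writebehind
end-volume
```

The problematic layout is the reverse: write-behind and io-cache defined
over each protocol/client volume, with replicate listing those as its
subvolumes.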

Please read http://bugs.gluster.com/cgi-bin/bugzilla3/show_bug.cgi?id=542#c4
for details about the internals of the corruption.

In summary, loading the performance translators (specifically io-cache)
in the normal location, on top of replicate, is a quick workaround for
the bug.

I have observed this corruption issue on the 2.0 branch WITHOUT any
performance translators, so there is also something else going wrong.

Gordan