Description of problem:
My setup has 5 gluster volumes, and each of them has
2 bricks as backend.
When I copy a large file (100MB) to a gluster volume,
about 9 times in 10 it works OK. But about 1 time in 10 the
md5sum of the resulting file is wrong. After checking I found
that the file in one brick has the correct md5sum, while the
file in the other brick has a wrong md5sum.
The size of the two files is the same.
By running "cmp -l <correct_file> <wrong_file>"
I found that the difference was in 49 bytes. So the
files in the two bricks had the same size, but 49 bytes
were different. Interestingly enough, I saw the same
count of 49 differing bytes at every check that I
made.
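For anyone who wants to see where the differing bytes are located (for example, whether they are contiguous or scattered), the cmp check above can be sketched in Python. This is only an illustration: the function name differing_offsets and the brick paths in the usage note are hypothetical, not part of my setup or of glusterfs.

```python
import os


def differing_offsets(path_a, path_b, chunk_size=1 << 20):
    """Return (offset, byte_a, byte_b) for every position where the files differ.

    Offsets are 1-based, the same convention `cmp -l` uses.
    Both files must have the same size, as they do in this report.
    """
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        raise ValueError("files differ in size")

    diffs = []
    offset = 0  # number of bytes already compared
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a:
                break
            if a != b:
                # Only walk byte by byte through chunks that actually differ.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        diffs.append((offset + i + 1, x, y))
            offset += len(a)
    return diffs
```

Calling differing_offsets("/bricks/b1/file", "/bricks/b2/file") (placeholder paths) and printing len() of the result should give the same count that cmp -l reports, and the offsets show how the altered bytes are distributed.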
Do you know what might cause this behavior? Has
anyone seen something like this before? Is this a bug in
glusterfs?
Version-Release number of selected component (if
applicable):
glusterfs 3.7.5 built on Nov 19 2015 16:29:59
Repository revision:
git://git.gluster.com/glusterfs.git
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the
terms of the GNU General Public License.
How reproducible:
Not easy to reproduce, about 1 in 10 times in some
environments, not reproducible at all in other
environments.
Steps to Reproduce:
1. scp <100MB file> <path in gluster volume>
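Since the failure only shows up in roughly 1 in 10 attempts, a loop that repeats the copy and checks the md5 each time may help. The following is a rough sketch under assumptions: it uses a local copy (shutil.copyfile) onto a mounted volume instead of scp, and the helper names md5_of and count_bad_copies are made up for illustration. A different copy function (for example one that shells out to scp) can be passed in through the copy parameter.

```python
import hashlib
import shutil


def md5_of(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def count_bad_copies(source, dest, attempts=10, copy=shutil.copyfile):
    """Copy source to dest `attempts` times; count copies whose md5 differs.

    `copy` is any callable taking (source, dest). The default is a plain
    local copy, which is an assumption and not the scp used in the report.
    """
    expected = md5_of(source)
    bad = 0
    for _ in range(attempts):
        copy(source, dest)
        if md5_of(dest) != expected:
            bad += 1
    return bad
```

Note that this checks the file as seen through the mount point. To compare the two brick copies directly, the md5 still has to be computed on each brick's backend path.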
Actual results:
1. About 1 in 10 times the destination file has an incorrect
checksum. The size is the same, but 49 bytes are altered.
2. "gluster volume heal <vol-name> info split-brain"
does not report that the bricks are in
split-brain, even though the checksum of the file in the
two bricks is different. The size of the file is the
same in the two bricks, but 49 bytes are altered.
Expected results:
1. The md5sum of the destination should be the same as that
of the source.
2. If the checksum of the file is different between the two
bricks, the command "gluster volume heal
<vol-name> info split-brain" should report that
the two bricks are in split-brain.
Additional info: