Hello community of Gluster,

Sorry for the long post. TL;DR: stock Gluster 3.3.0 on 5 nodes results in massive data corruption on brick "failure" or peer disconnection.

We are having problems with data corruption on VM volumes, with the VMs running on top of Gluster 3.3.0, when we introduce brick failures and/or node disconnects.

In our setup we have 5 storage nodes, each with 16 AMD Opteron(tm) 6128 cores, 32 GB of RAM and 34 x 2 TB SATA disks. To utilize the storage nodes we have 20 compute nodes, each with 24 AMD Opteron(tm) 6238 cores and 128 GB of RAM.

To test and verify this setup we installed Gluster 3.3.0 on the storage nodes and the GlusterFS 3.3.0 client on the compute nodes. We created one brick per hard drive and built a Distributed-Replicate volume from those bricks with tcp,rdma transport. The volume was mounted with glusterfs over tcp transport over InfiniBand on all the compute nodes.

We created 500 virtual machines on the compute nodes and had them do heavy IO benchmarking on the volume, and Gluster performed as expected. Then we wrote a sanity-test script that creates files, copies them over and over again, md5sums all written data, and md5-checks all of the operating system files. We ran this test on all the VMs successfully. We then ran it again while stopping one storage node for a few minutes and starting it again, and Gluster recovered from that successfully.

Next we ran the test again, but this time we did kill -9 on all Gluster processes on one node and left them down for more than an hour, keeping the tests running to emulate load, and then started the Gluster daemon on that storage node again. Around 10% of all the VMs lost their connection to Gluster and fell to a read-only file system, and more instances ended up with data corruption: missing or broken files. Very bad!

We wiped the VMs and created new ones, then started the same test again, but this time we terminated 4 bricks on one node and ran load tests to exercise shrinking and re-balancing. Before we even got the chance to remove/move the bricks we started getting a bunch of corrupted VMs and general data corruption, and after re-balancing we got a load of kernel panics on the VMs. Very bad indeed!

Is anyone else having the same problem? Is there anything we are doing wrong? Is this a lost cause?

Thanks for any input.

-Bob
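
P.S. In case it helps anyone reproduce this, here is a simplified sketch of what the sanity-check script does: create files, copy them repeatedly, md5-verify every copy, and re-check the OS files against a baseline. It is illustrative only; the directory paths, file sizes, iteration counts and OS paths below are placeholders, not the exact values or script we actually used.

#!/usr/bin/env python
"""Simplified sketch of the sanity-check logic (placeholder values only)."""
import hashlib
import os
import shutil

WORK_DIR = "/mnt/gluster-test/sanity"   # placeholder: test dir on the Gluster mount
OS_DIRS = ["/bin", "/usr/bin"]          # placeholder: OS paths to baseline-checksum
FILE_SIZE = 4 * 1024 * 1024             # placeholder: 4 MiB per test file
NUM_FILES = 8                           # placeholder: files per round
NUM_ROUNDS = 100                        # placeholder: copy/verify rounds


def md5_of(path):
    """Return the md5 hex digest of a file, read in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()


def baseline_os_checksums():
    """Checksum the (read-only) OS files once, before the test starts."""
    sums = {}
    for top in OS_DIRS:
        for root, _dirs, files in os.walk(top):
            for name in files:
                path = os.path.join(root, name)
                if os.path.isfile(path) and not os.path.islink(path):
                    sums[path] = md5_of(path)
    return sums


def run():
    if not os.path.isdir(WORK_DIR):
        os.makedirs(WORK_DIR)
    os_sums = baseline_os_checksums()

    # Create the initial set of test files and remember their checksums.
    expected = {}
    for i in range(NUM_FILES):
        path = os.path.join(WORK_DIR, "file-%d" % i)
        with open(path, "wb") as f:
            f.write(os.urandom(FILE_SIZE))
        expected[path] = md5_of(path)

    for round_no in range(NUM_ROUNDS):
        # Copy every file again and again and verify each copy's checksum.
        for src, want in expected.items():
            dst = "%s.copy-%d" % (src, round_no)
            shutil.copyfile(src, dst)
            got = md5_of(dst)
            if got != want:
                print("CORRUPTION: %s (want %s, got %s)" % (dst, want, got))

        # Re-verify the operating system files against the baseline.
        for path, want in os_sums.items():
            try:
                got = md5_of(path)
            except (IOError, OSError) as e:
                print("UNREADABLE: %s (%s)" % (path, e))
                continue
            if got != want:
                print("OS CORRUPTION: %s (want %s, got %s)" % (path, want, got))


if __name__ == "__main__":
    run()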