On Thu, Oct 1, 2015 at 2:24 AM, Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx> wrote:
> hi,
>       In all releases to date, from day 1 with replication, there is a corner
> case bug which can wipe out all the bricks in a replica set when the
> disk/brick(s) are replaced.
>
> Here are the steps that could lead to that situation:
> 0) Clients are operating on the volume and are actively pumping data.
> 1) Execute the replace-brick command, OR take down the brick that needs the
> disk replaced or re-formatted, and bring the brick back up.

So the better course of action would be to "remove-brick <vol> replica <n-1> start",
replace the disk, and then "add-brick <vol> replica <n+1>"? Perhaps it would be wise
to un-peer the host before adding the brick back?

Is there any chance that adding a 3rd replica to a 2-replica cluster with active
client writes could cause the same issue? On 3.7.3 I recently lost 2 of 3 bricks,
all the way down to the XFS filesystem being corrupted, but I blamed that on the
disk controller, which was doing a raid0 pass-through on 2 hosts but not on the
new 3rd host. This occurred after some time, though, and client writes were being
blocked while the 3rd brick was being added.

> 2) A client creates a file/directory at the root of the brick, which
> succeeds on the new brick but fails on the bricks that have been online
> (maybe because the file already existed on the bricks that are good copies).
> 3) Now when self-heal is triggered from the client/self-heal daemon, it
> thinks the just-replaced brick is the correct directory and deletes the
> files/directories from the bricks that have the actual data.
>
> I have been working on AFR for almost 4 years now and never saw any user
> complain about this problem. We were working on a document for an official
> way to replace a brick/disk, but it never occurred to us that this could
> happen until recently. I am going to get a proper document out by the end of
> this week on replacing bricks/disks in a safe way, and will keep you posted
> about fixes to prevent this from happening entirely.
>
> Pranith

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
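For readers following the thread, here is a rough sketch of the remove-brick /
add-brick sequence being asked about above, using a hypothetical 2-way replica
volume "testvol" and made-up brick paths. On recent 3.x releases, shrinking the
replica count generally takes "force" rather than "start", and the exact syntax
varies between GlusterFS versions, so treat this only as an outline and check
the documentation for your release before running anything:

    # Drop the replica that lives on the failing disk (replica count 2 -> 1).
    # Hypothetical volume and brick names; most 3.x releases require "force" here.
    gluster volume remove-brick testvol replica 1 server2:/bricks/brick1 force

    # Replace or re-format the disk on server2 and mount it at a fresh path,
    # then add it back while raising the replica count again (1 -> 2).
    gluster volume add-brick testvol replica 2 server2:/bricks/brick1_new

    # Trigger a full self-heal and watch it complete before trusting the new brick.
    gluster volume heal testvol full
    gluster volume heal testvol info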