On 12/15/2014 11:10 AM, Krutika Dhananjay wrote:
> Seems OK to me, as long as the appropriate locks are taken.
>
> -Krutika
>
> ------------------------------------------------------------------------
>
> *From: *"Emmanuel Dreyfus" <manu@xxxxxxxxxx>
> *To: *"Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> *Sent: *Saturday, December 13, 2014 8:08:03 PM
> *Subject: *AFR conservative merge portability
>
> Hello
>
> On NetBSD, tests/basic/afr/entry-self-heal.t always fails on this
> scenario:
>
>     mkdir spb_heal
>     kill brick brick0
>     touch spb_heal/0
>     glusterfs volume start force
>     kill_brick brick1
>     touch spb_heal/1
>     glusterfs volume start force
>
> At that point, conservative merge kicks in and copies spb_heal/0 and
> spb_heal/1 to each brick where it is missing. That works, but on NetBSD
> we are left with AFR xattrs on the spb_heal directory in which each
> brick accuses the other for metadata. This is a metadata split brain
> that will not self heal.
>
> This happens because after adding an entry, the parent directory
> (spb_heal here) mtime/ctime must be updated. On Linux, it seems the
> filesystem is responsible for that. On NetBSD, the kernel
> filesystem-independent code takes care of it and sends a SETATTR to
> update ctime/mtime on the parent directory.
>
> So when we touch spb_heal/0 and spb_heal/1, the NetBSD kernel sends a
> SETATTR for spb_heal ctime/mtime, and since the other brick is down,
> here is our metadata split brain.
>
> In http://review.gluster.org/9267, Krutika Dhananjay fixes the test by
> clearing the AFR xattrs to remove the split brain state, but while it
> lets the test pass, it does not address the real-world problem, which
> leaves a metadata split brain that does not self heal.
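For anyone who wants to see the parent-directory timestamp update described above on their own platform, here is a small sketch. It is plain Python against a local temporary directory, not GlusterFS code; the function name and the backdated timestamp are made up for illustration. It only shows that the update happens when an entry is added, not which kernel layer performs it.

```python
import os
import tempfile


def parent_mtime_bumped_by_create():
    """Return (mtime_before, mtime_after) of a directory around a child create."""
    with tempfile.TemporaryDirectory() as parent:
        # Backdate the directory so the update is visible regardless of
        # the filesystem's timestamp granularity.
        os.utime(parent, (1_000_000, 1_000_000))
        before = os.stat(parent).st_mtime

        # Equivalent of `touch spb_heal/0`: add one entry to the directory.
        with open(os.path.join(parent, "0"), "w"):
            pass

        after = os.stat(parent).st_mtime
        return before, after


if __name__ == "__main__":
    before, after = parent_mtime_bumped_by_create()
    print("parent mtime moved forward:", after > before)
```

With one brick down, the equivalent update lands on the surviving brick only, which is what AFR then records as a pending metadata operation.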
>
> Here is a proposal: we know that at the end of conservative merge, we
> should end up with the situation where the directory ctime/mtime is
> the ctime of the most recently added child. And fortunately, as
> conservative merge happens, the parent directory ctime/mtime is
> updated on each child addition, and we finish in the desired state.
>
> In other words, after conservative merge, a parent directory metadata
> split brain for only ctime/mtime can just be cleared by AFR without
> any harm.
>

Seems like adding a lot of code just to do this. AFR does data
selfheal, metadata selfheal and entry selfheal for a gfid, in that
order. After (conservative) entry self-heal completes, we would again
have to examine the looked-up iatts of the dir to see if the metadata
split brain is only due to c/mtime mismatch, and not due to
uid/gid/permissions mismatch, and only if so, clear the metadata
xattrs.

The check can be done in metadata selfheal itself, but I don't think
AFR is to blame if the user space app in NetBSD sends a setattr on the
parent dir. As far as AFR is concerned, it witnessed a pending metadata
fop on the dir and recorded it in the xattrs. We could end up in this
situation on Linux too if we kill bricks alternately and just do a
`touch /mount/existing_dir_name`.

Since the extra setattr comes from user space, we still have the
directory listed in the output of the `heal info` command and we can
resolve it.

-Ravi

> Does it look reasonable? Any opinion?
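To make the check discussed above concrete, here is a rough sketch of telling a time-only mismatch apart from a real one. This is not AFR code: `DirAttrs` and `only_times_differ` are hypothetical names, and the fields are a trimmed-down stand-in for the iatt that AFR looks up on each brick.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DirAttrs:
    """Hypothetical subset of the per-brick attributes of a directory."""
    uid: int
    gid: int
    mode: int
    mtime: float
    ctime: float


def only_times_differ(a: DirAttrs, b: DirAttrs) -> bool:
    """True when two bricks disagree on mtime/ctime and on nothing else."""
    same_ownership_and_perms = (a.uid, a.gid, a.mode) == (b.uid, b.gid, b.mode)
    times_differ = (a.mtime, a.ctime) != (b.mtime, b.ctime)
    return same_ownership_and_perms and times_differ


if __name__ == "__main__":
    brick0 = DirAttrs(uid=0, gid=0, mode=0o755, mtime=100.0, ctime=100.0)
    brick1 = replace(brick0, mtime=200.0, ctime=200.0)  # time-only mismatch
    brick2 = replace(brick1, mode=0o700)                # permissions differ too

    print(only_times_differ(brick0, brick1))
    print(only_times_differ(brick0, brick2))
```

Only the first kind of mismatch would be a candidate for clearing the metadata xattrs; anything involving uid, gid or permissions would still need a real heal or manual resolution.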
>
> --
> Emmanuel Dreyfus
> http://hcpnet.free.fr/pubz
> manu@xxxxxxxxxx
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel@xxxxxxxxxxx
> http://supercolony.gluster.org/mailman/listinfo/gluster-devel