On 12/03/2014 12:09 PM, Krutika Dhananjay wrote:
>
> ------------------------------------------------------------------------
>
> From: "Krutika Dhananjay" <kdhananj@xxxxxxxxxx>
> To: "Emmanuel Dreyfus" <manu@xxxxxxxxxx>
> Cc: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> Sent: Wednesday, December 3, 2014 11:54:03 AM
> Subject: Re: question on glustershd
>
> ------------------------------------------------------------------------
>
> From: "Emmanuel Dreyfus" <manu@xxxxxxxxxx>
> To: "Ravishankar N" <ravishankar@xxxxxxxxxx>, "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> Sent: Wednesday, December 3, 2014 10:14:22 AM
> Subject: Re: question on glustershd
>
> Ravishankar N <ravishankar@xxxxxxxxxx> wrote:
>
> > afr_shd_full_healer() is run only when we run 'gluster vol heal <volname>
> > full', doing a full brick traversal (readdirp) from the root and
> > attempting heal for each entry.
>
> Then we agree that "gluster vol heal $volume full" may fail to heal some
> files because of inode lock contention, right?
>
> If that is expected behavior, then the tests are wrong. For instance in
> tests/basic/afr/entry-self-heal.t we do "gluster vol heal $volume full"
> and we check that no unhealed files are left behind.
>
> Did I miss something, or do we have to either fix afr_shd_full_healer()
> or tests/basic/afr/entry-self-heal.t ?
>
> Typical use of "heal full" is in the event of a disk replacement where one of the bricks in the replica set is totally empty.
> And in a volume where both (assuming 2 way replication to keep the discussion simple) children of AFR are on the same node, SHD would launch two healers.
> Each healer does readdirp() only on the brick associated with it (see how @subvol is initialised in afr_shd_full_sweep()).
> I guess in such scenarios, the healer associated with the brick that was empty would have no entries to read, and as a result, nothing to heal from it to the other brick.
> In that case, there is no question of lock contention of the kind that you explained above?
>
> Come to think of it, it does not really matter whether the two bricks are on the same node or not.
> In either case, there may not be a lock contention between healers associated with different bricks, irrespective of whether they are part of the same SHD or SHDs on different nodes.
> -Krutika
>

Actually, there is a bug with full heal in afr-v2. When full heal is triggered, glusterd sends the heal op to only one shd of the replica pair: the one whose node has the highest uuid. That shd triggers heal only on the bricks that are local to it. So in a 1x2 volume where the bricks are on different nodes, only one shd gets the op, and it triggers readdirp + heal on its local client (brick) only (see BZ 1112158).

In afr-v1 too, only one shd receives the heal full op, but the readdirp is done at the afr level (as opposed to the client xlator level in v2), doing a conservative merge.
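
To make the difference concrete, here is a rough toy model in Python. It is only a sketch under stated assumptions, not GlusterFS code: `Brick`, `full_heal_v2` and `full_heal_v1` are hypothetical names, node uuids are compared as plain strings, and the conservative merge is simplified to a union of the entries on all bricks.

```python
# Toy model only: every name here is hypothetical, none of this is GlusterFS code.
from dataclasses import dataclass, field


@dataclass
class Brick:
    node_uuid: str                              # uuid of the node hosting the brick
    entries: set = field(default_factory=set)   # names a readdirp would return


def full_heal_v2(bricks):
    """afr-v2 as described above: only the shd on the highest-uuid node gets
    the op, and it crawls only the bricks local to that node."""
    chosen_node = max(b.node_uuid for b in bricks)  # assumption: string compare
    healed = set()
    for local in (b for b in bricks if b.node_uuid == chosen_node):
        for entry in sorted(local.entries):         # snapshot of the local crawl
            for other in bricks:
                if other is not local and entry not in other.entries:
                    other.entries.add(entry)
                    healed.add(entry)
    return healed


def full_heal_v1(bricks):
    """afr-v1 as described above: readdirp at the afr level, modelled here as
    a union of the entries seen on every brick (simplified conservative merge)."""
    merged = set().union(*(b.entries for b in bricks))
    healed = set()
    for b in bricks:
        missing = merged - b.entries
        b.entries |= missing
        healed |= missing
    return healed


if __name__ == "__main__":
    good = Brick("1111", {"a", "b"})
    replaced = Brick("ffff")                 # empty brick on the highest-uuid node
    print(full_heal_v2([good, replaced]))    # set()
```

In this model, when the replaced (empty) brick sits on the node with the highest uuid, the v2 crawl reads nothing and heals nothing, whereas the v1-style merge still copies both entries across.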
> -Krutika
>
> --
> Emmanuel Dreyfus
> http://hcpnet.free.fr/pubz
> manu@xxxxxxxxxx
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel@xxxxxxxxxxx
> http://supercolony.gluster.org/mailman/listinfo/gluster-devel