Hi,

Many thanks, a fix would be great .. :)

I've been doing a little more testing and can confirm that AFR definitely does not honor "sparse" when healing. This is particularly noticeable when using Xen images. A typical Xen image might be 3G, for example, with "du" reporting 600M used. After "healing" the image to another brick, it shows 3G in size and du shows 3G used. This makes a fair difference to my "images" volume (!) [in addition to the problems when applied to stripes!]

Regards,
Gareth.

----- Original Message -----
From: "Krishna Srinivas" <krishna@xxxxxxxxxxxxx>
To: "Gareth Bult" <gareth@xxxxxxxxxxxxx>
Cc: "gluster-devel" <gluster-devel@xxxxxxxxxx>
Sent: Sunday, December 30, 2007 8:10:42 PM (GMT) Europe/London
Subject: Re: AFR Heal Bug

Hi Gareth,

Yes, this bug was introduced recently after we changed the way the readdir() call works in glusterfs: afr is calling readdir() only on the first child (which is blank in your case). A fix will be on its way in a couple of days.

Thanks
Krishna

On Dec 31, 2007 12:39 AM, Gareth Bult <gareth@xxxxxxxxxxxxx> wrote:
> Ok, I'm going to call it a bug, tell me if I'm wrong .. :)
>
> (two servers, both define a "homes" volume)
>
> Client;
>
> volume nodea-homes
>   type protocol/client
>   option transport-type tcp/client
>   option remote-host nodea
>   option remote-subvolume homes
> end-volume
>
> volume nodeb-homes
>   type protocol/client
>   option transport-type tcp/client
>   option remote-host nodeb
>   option remote-subvolume homes
> end-volume
>
> volume homes-afr
>   type cluster/afr
>   subvolumes nodea-homes nodeb-homes   ### ISSUE IS HERE! ###
>   option scheduler rr
> end-volume
>
> Assume the system is completely up to date and working OK.
> Mount the homes filesystem on "client".
> Kill the "nodea" server.
> The system carries on, effectively using nodeb.
>
> Wipe nodea's physical volume.
> Restart the nodea server.
>
> All of a sudden, "client" sees an empty "homes" filesystem, although the data is still in place on "B" and "A" is blank.
> i.e. the client is seeing only the blank "nodea" (!)
>
> .. at this point you check nodeb to make sure your data really is there, then you can mop up the coffee you've just spat all over your screens ..
>
> If you crash nodeB instead, there appears to be no problem, and a self-heal "find" will correct the blank volume.
> Alternatively, if you reverse the subvolumes as listed above, you don't see the problem.
>
> The issue appears to be blanking the first subvolume.
>
> I'm thinking the order of the volumes should not be an issue; gluster should know that one volume is empty/new and one contains real data, and act accordingly, rather than relying on the order the volumes are listed in .. (???)
>
> I'm using fuse glfs7 and gluster 1.3.8 (tla).
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel@xxxxxxxxxx
> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>
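
P.S. To make the "sparse" point concrete: the behaviour I'd expect from a heal is roughly the sketch below (plain Python, nothing to do with the actual GlusterFS self-heal code, and the image paths are made up). Copy block by block, and where a block is entirely zeroes, seek past that range on the destination instead of writing it, so the holes survive and du stays at 600M rather than ballooning to the full 3G.

BLOCK = 65536  # copy granularity; anything reasonable works

def sparse_copy(src_path, dst_path):
    # Minimal hole-preserving copy: skip all-zero blocks instead of
    # writing them, so the destination stays sparse.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        offset = 0
        while True:
            block = src.read(BLOCK)
            if not block:
                break
            if block == b"\0" * len(block):
                # All zeroes: seek past the range instead of writing it,
                # which leaves a hole on filesystems that support them.
                offset += len(block)
                dst.seek(offset)
            else:
                dst.write(block)
                offset += len(block)
        # If the file ends in a hole, the seek alone doesn't grow the file;
        # truncate() pins the destination to the full logical size.
        dst.truncate(offset)

if __name__ == "__main__":
    # Made-up paths, for illustration only.
    sparse_copy("/srv/xen/domU.img", "/srv/xen/domU-copy.img")

Comparing "du -h" against "ls -lh" on source and destination afterwards shows whether the sparseness was kept; today's heal makes the two numbers equal on the healed brick.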