Compatibility was broken between 3.1.4 (and earlier) servers and 3.1.5 clients; the result is a hang when the replicate translator is used. This compatibility breakage was "necessary" in order to fix a hang issue which was present in all 3.1.x releases until then. New servers should work fine with old clients. Upgrade all your servers before upgrading the clients.

Avati

On Wed, Jun 29, 2011 at 8:23 PM, Burnash, James <jburnash at knight.com> wrote:

> I'm sorry - I think I wasn't clear.
>
> The problem is that a 3.1.5 client used to write a file to GlusterFS native
> mount point on a server running 3.1.3 hangs.
>
> Are you saying that the clients are known to not be backward compatible
> within the 3.1.x series?
>
> James Burnash
> Unix Engineer
> Knight Capital Group
>
> *From:* Anand Avati [mailto:anand.avati at gmail.com]
> *Sent:* Wednesday, June 29, 2011 10:46 AM
> *To:* Burnash, James
> *Cc:* gluster-users at gluster.org
> *Subject:* Re: Possible new bug in 3.1.5 discovered
>
> James,
>
> Both in 3.1.5 and 3.2.1 there were necessary locks hang fixes which went
> in and as a side effect clients and servers result in a hang when used
> across versions. Please upgrade your clients to 3.1.5 as well. This is a
> known, and hard to fix compatibility issue.
>
> Avati
>
> On Wed, Jun 29, 2011 at 8:05 PM, Burnash, James <jburnash at knight.com>
> wrote:
>
> "May you live in interesting times"
>
> Is this a curse or a blessing?
> :)
>
> I've just tested a 3.1.5 GlusterFS native client against a 3.1.3 storage
> pool using this volume:
>
> Volume Name: pfs-rw1
> Type: Distributed-Replicate
> Status: Started
> Number of Bricks: 2 x 2 = 4
> Transport-type: tcp
> Bricks:
> Brick1: jc1letgfs16-pfs1:/export/read-write/g01
> Brick2: jc1letgfs13-pfs1:/export/read-write/g01
> Brick3: jc1letgfs16-pfs1:/export/read-write/g02
> Brick4: jc1letgfs13-pfs1:/export/read-write/g02
> Options Reconfigured:
> performance.cache-size: 2GB
> performance.stat-prefetch: 0
> network.ping-timeout: 10
> diagnostics.client-log-level: ERROR
>
> Any attempt to write to that volume mounted on a native client using
> version 3.1.5 results in a hang at the command line, which I can only break
> out of by killing my ssh session into the client. Upon logging back into the
> same client, I see a zombie process from the attempt to write:
>
> 21172 ? D 0:00 touch /pfs1/test/junk1
>
> Anybody else run into this situation?
>
> Client mount log (/var/log/glusterfs/pfs2.log) below:
>
> [2011-06-29 10:28:07.860519] E
> [afr-self-heal-metadata.c:522:afr_sh_metadata_fix] 0-pfs-ro1-replicate-6:
> Unable to self-heal permissions/ownership of '/' (possible split-brain).
> Please fix the file on all backend volumes
> [2011-06-29 10:28:07.860668] E
> [afr-self-heal-metadata.c:522:afr_sh_metadata_fix] 0-pfs-ro1-replicate-1:
> Unable to self-heal permissions/ownership of '/' (possible split-brain).
> Please fix the file on all backend volumes
> s/ownership of '/' (possible split-brain). Please fix the file on all
> backend volumes
> s/ownership of '/' (possible split-brain). Please fix the file on all
> backend volumes
> s/ownership of '/' (possible split-brain).
> Please fix the file on all
> backend volumes
> s/ownership of '/' (possible split-brain). Please fix the file on all
> backend volumes
> ns/ownership of '/' (possible split-brain). Please fix the file on all
> backend volumes
> data self-heal failed on /
> data self-heal failed on /
> s/ownership of '/' (possible split-brain). Please fix the file on all
> backend volumes
> ns/ownership of '/' (possible split-brain). Please fix the file on all
> backend volumes
> data self-heal failed on /
> data self-heal failed on /
> data self-heal failed on /
> s/ownership of '/' (possible split-brain). Please fix the file on all
> backend volumes
> s/ownership of '/' (possible split-brain). Please fix the file on all
> backend volumes
> data self-heal failed on /
> data self-heal failed on /
> data self-heal failed on /
> s/ownership of '/' (possible split-brain). Please fix the file on all
> backend volumes
> data self-heal failed on /
> data self-heal failed on /
> data self-heal failed on /
> data self-heal failed on /
>
> James Burnash
> Unix Engineer
> Knight Capital Group
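
The compatibility rule stated in Avati's reply at the top of this thread can be sketched as a small check. This is only an illustration, not anything shipped with GlusterFS: the helper names `parse_version` and `hangs_with_replicate` are made up here, and the check models only the one pairing the reply names (3.1.x servers at 3.1.4 or earlier with 3.1.5 clients). It says nothing about 3.2.x or other combinations.

```python
def parse_version(text):
    """Turn a version string such as '3.1.5' into a tuple of ints: (3, 1, 5)."""
    return tuple(int(part) for part in text.split("."))


def hangs_with_replicate(server, client):
    """Return True for the server/client pairing the reply describes as broken.

    Assumption: only the case stated in the thread is modelled, namely a
    3.1.x server at 3.1.4 or earlier talking to a 3.1.5 client.
    """
    s = parse_version(server)
    c = parse_version(client)
    return s[:2] == (3, 1) and s <= (3, 1, 4) and c == (3, 1, 5)


if __name__ == "__main__":
    # The setup reported in the thread: 3.1.3 servers with a 3.1.5 client.
    print(hangs_with_replicate("3.1.3", "3.1.5"))
    # New servers with old clients, which the reply says should work.
    print(hangs_with_replicate("3.1.5", "3.1.3"))
```

A check like this could run before a rolling upgrade to confirm that every server is already on 3.1.5 before any client is touched, which is the order the reply recommends.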