Sorry - I just discovered that I appended the wrong log lines to my original post - here are the correct ones, for completeness:

[2011-06-29 10:21:32.959956] E [rpc-clnt.c:199:call_bail] 0-pfs-rw1-client-2: bailing out frame type(GlusterFS 3.1) op(ENTRYLK(31)) xid = 0x61874x sent = 2011-06-29 09:51:31.447474. timeout = 1800
[2011-06-29 10:51:34.781215] E [rpc-clnt.c:199:call_bail] 0-pfs-rw1-client-3: bailing out frame type(GlusterFS 3.1) op(ENTRYLK(31)) xid = 0x62358x sent = 2011-06-29 10:21:32.960048. timeout = 1800

James Burnash
Unix Engineer
Knight Capital Group

From: Anand Avati [mailto:anand.avati at gmail.com]
Sent: Wednesday, June 29, 2011 11:17 AM
To: Burnash, James
Cc: gluster-users at gluster.org
Subject: Re: Possible new bug in 3.1.5 discovered

Compatibility was broken between 3.1.4 (and pre) servers and 3.1.5 clients (results in a hang when replicate translator is used). This compat breakage was "necessary" in order to fix a hang issue which was present in all 3.1.x till then. New servers should work fine with old clients. Upgrade all your servers before upgrading the clients.

Avati

On Wed, Jun 29, 2011 at 8:23 PM, Burnash, James <jburnash at knight.com> wrote:

I'm sorry - I think I wasn't clear. The problem is that a 3.1.5 client used to write a file to GlusterFS native mount point on a server running 3.1.3 hangs.

Are you saying that the clients are known to not be backward compatible within the 3.1.x series?

James Burnash
Unix Engineer
Knight Capital Group

From: Anand Avati [mailto:anand.avati at gmail.com]
Sent: Wednesday, June 29, 2011 10:46 AM
To: Burnash, James
Cc: gluster-users at gluster.org
Subject: Re: Possible new bug in 3.1.5 discovered

James,

Both in 3.1.5 and 3.2.1 there were necessary locks hang fixes which went in and as a side effect clients and servers result in a hang when used across versions.
Please upgrade your clients to 3.1.5 as well. This is a known, and hard to fix compatibility issue.

Avati

On Wed, Jun 29, 2011 at 8:05 PM, Burnash, James <jburnash at knight.com> wrote:

"May you live in interesting times"

Is this a curse or a blessing? :)

I've just tested a 3.1.5 GlusterFS native client against a 3.1.3 storage pool using this volume:

Volume Name: pfs-rw1
Type: Distributed-Replicate
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: jc1letgfs16-pfs1:/export/read-write/g01
Brick2: jc1letgfs13-pfs1:/export/read-write/g01
Brick3: jc1letgfs16-pfs1:/export/read-write/g02
Brick4: jc1letgfs13-pfs1:/export/read-write/g02
Options Reconfigured:
performance.cache-size: 2GB
performance.stat-prefetch: 0
network.ping-timeout: 10
diagnostics.client-log-level: ERROR

Any attempt to write to that volume mounted on a native client using version 3.1.5 results in a hang at the command line, which I can only break out of by killing my ssh session into the client. Upon logging back into the same client, I see a zombie process from the attempt to write:

21172 ? D 0:00 touch /pfs1/test/junk1

Anybody else run into this situation?

Client mount log (/var/log/glusterfs/pfs2.log) below:

[2011-06-29 10:28:07.860519] E [afr-self-heal-metadata.c:522:afr_sh_metadata_fix] 0-pfs-ro1-replicate-6: Unable to self-heal permissions/ownership of '/' (possible split-brain). Please fix the file on all backend volumes
[2011-06-29 10:28:07.860668] E [afr-self-heal-metadata.c:522:afr_sh_metadata_fix] 0-pfs-ro1-replicate-1: Unable to self-heal permissions/ownership of '/' (possible split-brain). Please fix the file on all backend volumes
s/ownership of '/' (possible split-brain). Please fix the file on all backend volumes
s/ownership of '/' (possible split-brain). Please fix the file on all backend volumes
s/ownership of '/' (possible split-brain). Please fix the file on all backend volumes
s/ownership of '/' (possible split-brain).
Please fix the file on all backend volumes
ns/ownership of '/' (possible split-brain). Please fix the file on all backend volumes
data self-heal failed on /
data self-heal failed on /
s/ownership of '/' (possible split-brain). Please fix the file on all backend volumes
ns/ownership of '/' (possible split-brain). Please fix the file on all backend volumes
data self-heal failed on /
data self-heal failed on /
data self-heal failed on /
s/ownership of '/' (possible split-brain). Please fix the file on all backend volumes
s/ownership of '/' (possible split-brain). Please fix the file on all backend volumes
data self-heal failed on /
data self-heal failed on /
data self-heal failed on /
s/ownership of '/' (possible split-brain). Please fix the file on all backend volumes
data self-heal failed on /
data self-heal failed on /
data self-heal failed on /
data self-heal failed on /

James Burnash
Unix Engineer
Knight Capital Group

DISCLAIMER:
This e-mail, and any attachments thereto, is intended only for use by the addressee(s) named herein and may contain legally privileged and/or confidential information. If you are not the intended recipient of this e-mail, you are hereby notified that any dissemination, distribution or copying of this e-mail and any attachments thereto, is strictly prohibited. If you have received this in error, please immediately notify me and permanently delete the original and any printout thereof. E-mail transmission cannot be guaranteed to be secure or error-free. The sender therefore does not accept liability for any errors or omissions in the contents of this message which arise as a result of e-mail transmission.

NOTICE REGARDING PRIVACY AND CONFIDENTIALITY
Knight Capital Group may, at its discretion, monitor and review the content of all e-mail communications.
http://www.knight.com

_______________________________________________
Gluster-users mailing list
Gluster-users at gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
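
For anyone planning the same upgrade, the ordering rule Avati gives in this thread (3.1.5 clients hang against 3.1.4-and-earlier servers when replicate is used, while old clients work against new servers) can be sketched as a small pre-flight check. This is only an illustration under those stated assumptions: the function names and the host-to-version mapping are hypothetical, nothing here queries GlusterFS itself, and it only covers the 3.1.x case described above.

```python
def parse_version(text):
    """Turn a string such as '3.1.5' into (3, 1, 5) for numeric comparison."""
    return tuple(int(part) for part in text.split("."))


# First 3.1.x release carrying the lock-hang fix, according to the thread.
FIRST_FIXED = parse_version("3.1.5")


def servers_to_upgrade_first(client_version, server_versions):
    """Return the servers that should be upgraded before this client is used.

    server_versions is a hypothetical mapping of hostname -> version string,
    filled in by hand. An empty result means the combination is not the one
    reported to hang in this thread.
    """
    if parse_version(client_version) < FIRST_FIXED:
        # Old clients against old or new servers: reported to work.
        return []
    return sorted(
        host
        for host, version in server_versions.items()
        if parse_version(version) < FIRST_FIXED
    )


# The situation reported above: a 3.1.5 client against a 3.1.3 pool.
pool = {"jc1letgfs16-pfs1": "3.1.3", "jc1letgfs13-pfs1": "3.1.3"}
print(servers_to_upgrade_first("3.1.5", pool))
# prints ['jc1letgfs13-pfs1', 'jc1letgfs16-pfs1']
```

Comparing versions as integer tuples rather than strings avoids mis-ordering releases such as 3.1.10 and 3.1.5.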