Hi,
     This can happen if there is a split-brain on that directory. Could you post the output of "getfattr -d -m . /data/vmail/var/vmail" on all the bricks, so that we can confirm whether that is the case?

Pranith.

On 08/31/2011 01:59 AM, gluster1206 at akxnet.de wrote:
> Hi!
>
> I am using Gluster 3.2.1 on a two/three-node openSUSE 11.3/11.4 server
> cluster, where the Gluster nodes are both server and client.
>
> While migrating the cluster to servers with higher performance, I tried
> the Gluster 3.3 beta.
>
> Both versions show the same problem:
>
> A single volume (holding the mail base, accessed by the POP3, IMAP and
> SMTP servers) reports an "Input/output error" shortly after mounting and
> becomes inaccessible. The same volume mounted on another, idle server
> still works.
>
> ls /var/vmail
> ls: cannot access /var/vmail: Input/output error
>
> lsof /var/vmail
> lsof: WARNING: can't stat() fuse.glusterfs file system /var/vmail
>       Output information may be incomplete.
> lsof: status error on /var/vmail: Input/output error
>
> After unmounting and remounting the volume, the same thing happens.
>
> I tried to recreate the volume, but that does not help.
>
> Although the volume was just created, the log is full of "self-heal"
> entries (but those should not cause the volume to disappear, right?).
>
> I first tried it with three bricks (one of which I had to remove) and
> the following parameters:
>
> Volume Name: vmail
> Type: Replicate
> Status: Started
> Number of Bricks: 3
> Transport-type: tcp
> Bricks:
> Brick1: mx00.akxnet.de:/data/vmail
> Brick2: mx02.akxnet.de:/data/vmail
> Brick3: mx01.akxnet.de:/data/vmail
> Options Reconfigured:
> network.ping-timeout: 15
> performance.write-behind-window-size: 2097152
> auth.allow: xx.xx.xx.xx,yy.yy.yy.yy,zz.zz.zz.zz,127.0.0.1
> performance.io-thread-count: 64
> performance.io-cache: on
> performance.stat-prefetch: on
> performance.quick-read: off
> nfs.disable: on
> performance.cache-size: 32MB (I also tried 64MB)
>
> and then, after the delete/create, with two bricks and the following
> parameters:
>
> Volume Name: vmail
> Type: Replicate
> Status: Started
> Number of Bricks: 2
> Transport-type: tcp
> Bricks:
> Brick1: mx02.akxnet.de:/data/vmail
> Brick2: mx01.akxnet.de:/data/vmail
> Options Reconfigured:
> performance.quick-read: off
> nfs.disable: on
> auth.allow: xx.xx.xx.xx,yy.yy.yy.yy,zz.zz.zz.zz,127.0.0.1
>
> but always with the same result.
>
> The log entries:
>
> [2011-08-30 22:10:45.376568] I
> [afr-self-heal-common.c:1557:afr_self_heal_completion_cbk]
> 0-vmail-replicate-0: background data data self-heal completed on
> /xxxxx.de/yyyyyyyyyy/.Tauchen/courierimapuiddb
> [2011-08-30 22:10:45.385541] I [afr-common.c:801:afr_lookup_done]
> 0-vmail-replicate-0: background meta-data self-heal triggered. path:
> /xxxxx.de/yyyyyyyyy/.Tauchen/courierimapkeywords
>
> The volume is presently unusable. Any hint?
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
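[Editor's addendum] A minimal sketch of how the check Pranith requests could be scripted, so the extended attributes from all bricks can be compared side by side. This is an assumption-laden illustration, not from the thread: it assumes root ssh access to the brick hosts and uses the brick path /data/vmail and the two remaining hostnames from the "gluster volume info" output above (the exact path in Pranith's getfattr command may need adjusting to your layout). getfattr ships in the attr package on openSUSE, and it must run as root, since trusted.* attributes are hidden from ordinary users.

    #!/bin/sh
    # Dump the AFR changelog xattrs of the brick directory on every brick
    # host, so a split-brain can be confirmed. Hostnames and BRICK path are
    # taken from the volume info in this thread; adjust for your setup.
    BRICK=/data/vmail

    for host in mx01.akxnet.de mx02.akxnet.de; do
        echo "=== $host:$BRICK ==="
        # -d dumps attribute values, -m . matches every attribute name
        # (the default pattern hides the trusted.* namespace), and -e hex
        # prints the raw changelog counters in hexadecimal.
        ssh root@"$host" "getfattr -d -m . -e hex $BRICK"
    done

In the output, the attributes to look at are trusted.afr.vmail-client-0 and trusted.afr.vmail-client-1: each value packs three 32-bit counters (data, metadata, and entry changelog). If each brick shows non-zero pending counters for the other brick, neither copy can be picked as the self-heal source; in that case AFR returns EIO, which is exactly the "Input/output error" seen on the mount.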