Hi David.

Here's an example of the getfattr from my server:

fs17:/var/tmp/hptools# getfattr -d -e hex -m trusted.afr /export/read-only/g01
getfattr: Removing leading '/' from absolute path names
# file: export/read-only/g01
trusted.afr.pfs-ro1-client-0=0x000000000000000000000000
trusted.afr.pfs-ro1-client-1=0x000000000000000100000000

The hex value is the same for all 10 of my directories.

James Burnash, Unix Engineering

-----Original Message-----
From: gluster-users-bounces at gluster.org [mailto:gluster-users-bounces at gluster.org] On Behalf Of David Lloyd
Sent: Wednesday, January 26, 2011 12:24 PM
To: gluster-users at gluster.org
Subject: Re: self heal errors on 3.1.1 clients

I read on another thread about checking the getfattr output for each brick, but it tailed off before any explanation of what to do with this information.

We have 8 bricks in the volume. Config is:

g1:~ # gluster volume info glustervol1

Volume Name: glustervol1
Type: Distributed-Replicate
Status: Started
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: g1:/mnt/glus1
Brick2: g2:/mnt/glus1
Brick3: g3:/mnt/glus1
Brick4: g4:/mnt/glus1
Brick5: g1:/mnt/glus2
Brick6: g2:/mnt/glus2
Brick7: g3:/mnt/glus2
Brick8: g4:/mnt/glus2
Options Reconfigured:
performance.write-behind-window-size: 100mb
performance.cache-size: 512mb
performance.stat-prefetch: on

and the getfattr outputs are:

g1:~ # getfattr -d -e hex -m trusted.afr /mnt/glus1
getfattr: Removing leading '/' from absolute path names
# file: mnt/glus1
trusted.afr.glustervol1-client-0=0x000000000000000000000000
trusted.afr.glustervol1-client-1=0x000000000000000000000000

g1:~ # getfattr -d -e hex -m trusted.afr /mnt/glus2
getfattr: Removing leading '/' from absolute path names
# file: mnt/glus2
trusted.afr.glustervol1-client-4=0x000000000000000000000000
trusted.afr.glustervol1-client-5=0x000000000000000000000000

g2:~ # getfattr -d -e hex -m trusted.afr /mnt/glus1
getfattr: Removing leading '/' from absolute path names
# file: mnt/glus1
trusted.afr.glustervol1-client-0=0x000000000000000000000000
trusted.afr.glustervol1-client-1=0x000000000000000000000000

g2:~ # getfattr -d -e hex -m trusted.afr /mnt/glus2
getfattr: Removing leading '/' from absolute path names
# file: mnt/glus2
trusted.afr.glustervol1-client-4=0x000000000000000000000000
trusted.afr.glustervol1-client-5=0x000000000000000000000000

g3:~ # getfattr -d -e hex -m trusted.afr /mnt/glus1
getfattr: Removing leading '/' from absolute path names
# file: mnt/glus1
trusted.afr.glustervol1-client-2=0x000000000000000000000000
trusted.afr.glustervol1-client-3=0x000000000000000100000000

g3:~ # getfattr -d -e hex -m trusted.afr /mnt/glus2
getfattr: Removing leading '/' from absolute path names
# file: mnt/glus2
trusted.afr.glustervol1-client-6=0x000000000000000000000000
trusted.afr.glustervol1-client-7=0x000000000000000000000000

g4:~ # getfattr -d -e hex -m trusted.afr /mnt/glus1
getfattr: Removing leading '/' from absolute path names
# file: mnt/glus1
trusted.afr.glustervol1-client-2=0x000000000000000100000000
trusted.afr.glustervol1-client-3=0x000000000000000000000000

g4:~ # getfattr -d -e hex -m trusted.afr /mnt/glus2
getfattr: Removing leading '/' from absolute path names
# file: mnt/glus2
trusted.afr.glustervol1-client-6=0x000000000000000000000000
trusted.afr.glustervol1-client-7=0x000000000000000000000000

Hope someone can help. Things still seem to be working, but slowed down.

Cheers
David

On 26 January 2011 17:07, David Lloyd <david.lloyd at v-consultants.co.uk> wrote:

> We started getting the same problem at almost exactly the same time.
>
> I get one of these messages every time I access the root of the mounted
> volume (and nowhere else, I think).
> This is also 3.1.1.
>
> I'm just starting to look into it; I'll let you know if I get anywhere.
>
> David
>
> On 26 January 2011 16:38, Burnash, James <jburnash at knight.com> wrote:
>
>> These errors are appearing in the file
>> /var/log/glusterfs/<mountpoint>.log
>>
>> [2011-01-26 11:02:10.342349] I [afr-common.c:672:afr_lookup_done]
>> pfs-ro1-replicate-5: split brain detected during lookup of /.
>> [2011-01-26 11:02:10.342366] I [afr-common.c:716:afr_lookup_done]
>> pfs-ro1-replicate-5: background meta-data data self-heal triggered.
>> path: /
>> [2011-01-26 11:02:10.342502] E
>> [afr-self-heal-metadata.c:524:afr_sh_metadata_fix] pfs-ro1-replicate-2:
>> Unable to self-heal permissions/ownership of '/' (possible split-brain).
>> Please fix the file on all backend volumes
>>
>> Apparently the issue is the root of the storage pool, which in my
>> case on the backend storage servers is this path:
>>
>> /export/read-only - permissions are: drwxr-xr-x 12 root root
>> 4096 Dec 28 12:09 /export/read-only/
>>
>> Installation is GlusterFS 3.1.1 on servers and clients, servers
>> running CentOS 5.5, clients running CentOS 5.2.
>>
>> The volume info header is below:
>>
>> Volume Name: pfs-ro1
>> Type: Distributed-Replicate
>> Status: Started
>> Number of Bricks: 10 x 2 = 20
>> Transport-type: tcp
>>
>> Any ideas? I don't see a permission issue on the directory or its
>> subs themselves.
>>
>> James Burnash, Unix Engineering
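
On reading the getfattr values in this thread: the sketch below decodes a trusted.afr.* value, assuming it is a 12-byte changelog made of three big-endian 32-bit pending-operation counters in the order data, metadata, entry. That layout is my understanding of the AFR changelog format and is not stated anywhere in this thread, so treat it as an assumption. The sketch also assumes that client-N corresponds to brick N+1 in the "gluster volume info" listing (so g3:/mnt/glus1 is client-2 and g4:/mnt/glus1 is client-3). The function names decode_afr_changelog and mutual_accusation are made up for illustration; the hex values are copied from the g3 and g4 output for /mnt/glus1.

```python
import struct

KINDS = ("data", "metadata", "entry")


def decode_afr_changelog(hex_value):
    """Split a trusted.afr.* hex value into pending counts per kind.

    Assumption: 12 bytes = three big-endian 32-bit counters
    (data, metadata, entry), as described in the lead-in.
    """
    digits = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    return dict(zip(KINDS, struct.unpack(">III", raw)))


def mutual_accusation(name_a, view_a, name_b, view_b):
    """Return the kinds for which each brick records pending ops on the other.

    view_a is what brick A stores about every client; view_b likewise for B.
    """
    a_about_b = decode_afr_changelog(view_a[name_b])
    b_about_a = decode_afr_changelog(view_b[name_a])
    return [k for k in KINDS if a_about_b[k] and b_about_a[k]]


# Values copied from the getfattr output for /mnt/glus1 on g3 and g4.
g3_view = {
    "client-2": "0x000000000000000000000000",
    "client-3": "0x000000000000000100000000",
}
g4_view = {
    "client-2": "0x000000000000000100000000",
    "client-3": "0x000000000000000000000000",
}

if __name__ == "__main__":
    for host, view in (("g3", g3_view), ("g4", g4_view)):
        for client, value in sorted(view.items()):
            print(host, client, decode_afr_changelog(value))
    print("both sides pending:",
          mutual_accusation("client-2", g3_view, "client-3", g4_view))
```

Under those assumptions, the only non-zero counter in any of the posted values is the metadata one, and on the g3/g4 pair each brick holds it against the other. That would be consistent with the "Unable to self-heal permissions/ownership of '/'" message, which is also about metadata, but it is a reading of the numbers and not a confirmed diagnosis.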