Re: folder not being healed

Andreas,

Gluster does not permit applications to set any extended attribute whose name starts with trusted.afr.* (among other reserved patterns).
It is therefore not clear how the trusted.afr.remote1/2 extended attributes are appearing in the getfattr output you shared.
Were they set directly on the backend (by backend, I mean the bricks) by any chance?
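
(For illustration, an attempt to set such an attribute through a client mount is rejected; this is a sketch assuming a hypothetical FUSE mount at /mnt/share, and the exact error text may vary:

# setfattr -n trusted.afr.remote1 -v 0x000000000000000000000000 /mnt/share/media/ga/live/a
setfattr: /mnt/share/media/ga/live/a: Operation not permitted

The same setfattr run against the brick path itself succeeds, because the brick is an ordinary local filesystem, which is how such attributes usually end up on the backend.)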

-Krutika

From: "Andreas Tsaridas" <andreas.tsaridas@xxxxxxxxx>
To: "Pranith Kumar Karampuri" <pkarampu@xxxxxxxxxx>
Cc: "Krutika Dhananjay" <kdhananj@xxxxxxxxxx>, gluster-users@xxxxxxxxxxx
Sent: Tuesday, January 5, 2016 12:27:41 AM
Subject: Re: folder not being healed

Hi,

I don't understand the question. Should I send you some kind of configuration?

PS: I tried looking for you on IRC.

Thanks

On Mon, Jan 4, 2016 at 5:20 PM, Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx> wrote:


On 01/04/2016 09:14 PM, Andreas Tsaridas wrote:
Hello,

Unfortunately I get:

-bash: /usr/bin/getfattr: Argument list too long

There are a lot of files in these directories, and even ls takes a long time to show results.
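
A way around the shell's argument-list limit is to let find batch the getfattr calls instead of expanding the glob; a sketch using the same path and flags:

# find media/ga/live/a -maxdepth 1 -exec getfattr -d -m . -e hex {} +

find splits the file list across as many getfattr invocations as needed, so no single command line exceeds the kernel's ARG_MAX.
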
Krutika pointed out something important to me on IRC: why does the volume have two sets of trusted.afr.* xattrs, i.e. both trusted.afr.remote1/2 and trusted.afr.share-client-0/1?

Pranith


How would I be able to keep the copy from web01 and discard the other?

Thanks

On Mon, Jan 4, 2016 at 3:59 PM, Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx> wrote:
hi Andreas,
        The directory is in split-brain. Do you have any files/directories that are in split-brain inside the directory 'media/ga/live/a'?
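
Should you end up wanting to keep web01's copy and discard the other, one manual approach (a sketch only, assuming the standard AFR changelog layout: it must be run on the brick backend, not through the mount, and the xattr to zero is whichever one on the discarded copy accuses the surviving brick, so verify with getfattr first) is to clear the pending changelog on the discarded copy and re-trigger the heal:

web02 # setfattr -n trusted.afr.remote1 -v 0x000000000000000000000000 /srv/share/glusterfs/media/ga/live/a
web02 # gluster volume heal share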

Could you give the output of
"getfattr -d -m . -e hex media/ga/live/a/*" from both the bricks?

Pranith


On 01/04/2016 05:21 PM, Andreas Tsaridas wrote:
Hello,

Please see below:
-----

web01 # getfattr -d -m . -e hex media/ga/live/a
# file: media/ga/live/a
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.remote1=0x000000000000000000000000
trusted.afr.remote2=0x000000000000000000000005
trusted.afr.share-client-0=0x000000000000000000000000
trusted.afr.share-client-1=0x0000000000000000000000ee
trusted.gfid=0xb13199a1464c44918464444b3f7eeee3
trusted.glusterfs.dht=0x000000010000000000000000ffffffff 


------

web02 # getfattr -d -m . -e hex media/ga/live/a
# file: media/ga/live/a
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.remote1=0x000000000000000000000008
trusted.afr.remote2=0x000000000000000000000000
trusted.afr.share-client-0=0x000000000000000000000000
trusted.afr.share-client-1=0x000000000000000000000000
trusted.gfid=0xb13199a1464c44918464444b3f7eeee3
trusted.glusterfs.dht=0x000000010000000000000000ffffffff

------
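
(For reference: each trusted.afr.* value is three big-endian 32-bit counters of pending operations, in the order data / metadata / entry. Decoding the non-zero values above on that assumption:

web01: trusted.afr.remote2=0x000000000000000000000005        -> entry counter 0x05 = 5
web01: trusted.afr.share-client-1=0x0000000000000000000000ee -> entry counter 0xee = 238
web02: trusted.afr.remote1=0x000000000000000000000008        -> entry counter 0x08 = 8

If the stale remote1/remote2 names still map onto the current bricks, each side is accusing the other, which would explain why self-heal cannot pick a source.)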

Regards,
AT

On Mon, Jan 4, 2016 at 12:44 PM, Krutika Dhananjay <kdhananj@xxxxxxxxxx> wrote:
Hi,

Could you share the output of
# getfattr -d -m . -e hex <abs-path-to-media/ga/live/a>

from both the bricks?

-Krutika

From: "Andreas Tsaridas" <andreas.tsaridas@xxxxxxxxx>
To: gluster-users@xxxxxxxxxxx
Sent: Monday, January 4, 2016 5:10:58 PM
Subject: folder not being healed


Hello,

I have a cluster of two replicated nodes running glusterfs 3.6.3 on Red Hat 6.6. The problem is that a specific folder keeps being picked up for healing but never actually gets healed; this has been going on for two weeks now.

-----

# gluster volume status
Status of volume: share
Gluster process                           Port   Online  Pid
------------------------------------------------------------------------------
Brick 172.16.4.1:/srv/share/glusterfs     49152  Y       10416
Brick 172.16.4.2:/srv/share/glusterfs     49152  Y       19907
NFS Server on localhost                   2049   Y       22664
Self-heal Daemon on localhost             N/A    Y       22676
NFS Server on 172.16.4.2                  2049   Y       19923
Self-heal Daemon on 172.16.4.2            N/A    Y       19937

Task Status of Volume share
------------------------------------------------------------------------------
There are no active volume tasks

------

# gluster volume info

Volume Name: share
Type: Replicate
Volume ID: 17224664-645c-48b7-bc3a-b8fc84c6ab30
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 172.16.4.1:/srv/share/glusterfs
Brick2: 172.16.4.2:/srv/share/glusterfs
Options Reconfigured:
cluster.background-self-heal-count: 20
cluster.heal-timeout: 2
performance.normal-prio-threads: 64
performance.high-prio-threads: 64
performance.least-prio-threads: 64
performance.low-prio-threads: 64
performance.flush-behind: off
performance.io-thread-count: 64

------

# gluster volume heal share info
Brick web01.rsdc:/srv/share/glusterfs/
/media/ga/live/a - Possibly undergoing heal

Number of entries: 1

Brick web02.rsdc:/srv/share/glusterfs/
Number of entries: 0

-------

# gluster volume heal share info split-brain
Gathering list of split brain entries on volume share has been successful

Brick 172.16.4.1:/srv/share/glusterfs
Number of entries: 0

Brick 172.16.4.2:/srv/share/glusterfs
Number of entries: 0

-------

==> /var/log/glusterfs/glustershd.log <==
[2016-01-04 11:35:33.004831] I [afr-self-heal-entry.c:554:afr_selfheal_entry_do] 0-share-replicate-0: performing entry selfheal on b13199a1-464c-4491-8464-444b3f7eeee3
[2016-01-04 11:36:07.449192] W [client-rpc-fops.c:2772:client3_3_lookup_cbk] 0-share-client-1: remote operation failed: No data available. Path: (null) (00000000-0000-0000-0000-000000000000)
[2016-01-04 11:36:07.449706] W [client-rpc-fops.c:240:client3_3_mknod_cbk] 0-share-client-1: remote operation failed: File exists. Path: (null)

Could you please advise?

Kind regards,

AT

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
