Re: Possible stale .glusterfs/indices/xattrop file?

On 07/31/2017 02:33 PM, mabi wrote:
Now I understand what you mean by the "-samefile" parameter of "find". As requested, I have now run the following command on all 3 nodes; the output from each node is below:

sudo find /data/myvolume/brick -samefile /data/myvolume/brick/.glusterfs/29/e0/29e0d13e-1217-41cc-9bda-1fbbf781c397 -ls

node1:
8404683    0 lrwxrwxrwx   1 root     root           66 Jul 27 15:43 /data/myvolume/brick/.glusterfs/29/e0/29e0d13e-1217-41cc-9bda-1fbbf781c397 -> ../../fe/c0/fec0e4f4-38d2-4e2e-b5db-fdc0b9b54810/OC_DEFAULT_MODULE

node2:
8394638    0 lrwxrwxrwx   1 root     root           66 Jul 27 15:43 /data/myvolume/brick/.glusterfs/29/e0/29e0d13e-1217-41cc-9bda-1fbbf781c397 -> ../../fe/c0/fec0e4f4-38d2-4e2e-b5db-fdc0b9b54810/OC_DEFAULT_MODULE



arbiternode:
find: '/data/myvolume/brick/.glusterfs/29/e0/29e0d13e-1217-41cc-9bda-1fbbf781c397': No such file or directory


Right, so the file OC_DEFAULT_MODULE is missing on this brick. Its parent directory has gfid fec0e4f4-38d2-4e2e-b5db-fdc0b9b54810.
The goal is to do a stat of this file from the fuse mount. If you know the complete path to this file, good. Otherwise you can use this script [1] to find the path to the parent dir corresponding to the gfid fec0e4f4-38d2-4e2e-b5db-fdc0b9b54810, like so:
`./gfid-to-dirname.sh  /data/myvolume/brick fec0e4f4-38d2-4e2e-b5db-fdc0b9b54810`

[1] https://github.com/gluster/glusterfs/blob/master/extras/gfid-to-dirname.sh

Try to stat the file from a new (temporary) fuse mount to avoid any caching effects.
-Ravi
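
The path arithmetic used above can be sketched in a few lines of shell. This is only an illustration built from the paths shown in this thread: the function names (gfid_backend_path, parent_gfid_of_target, name_of_target) are made up for the example and are not part of gluster or of the gfid-to-dirname.sh script.

```shell
#!/bin/sh
# Hypothetical helpers, not part of gluster.

# Build the backend path of a gfid under a brick, following the layout
# visible in this thread: <brick>/.glusterfs/<chars 1-2>/<chars 3-4>/<gfid>
gfid_backend_path() {
    brick=$1
    gfid=$2
    d1=$(printf '%s' "$gfid" | cut -c1-2)
    d2=$(printf '%s' "$gfid" | cut -c3-4)
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick" "$d1" "$d2" "$gfid"
}

# Given a symlink target of the form ../../fe/c0/<parent-gfid>/<name>,
# print the gfid of the parent directory.
parent_gfid_of_target() {
    basename "$(dirname "$1")"
}

# Given the same kind of symlink target, print the entry name.
name_of_target() {
    basename "$1"
}
```

On a brick, the symlink target could be obtained with `readlink` on the path printed by gfid_backend_path. Once the full path is known, the stat would be run against a temporary fuse mount, for example one created with `mount -t glusterfs node1.domain.tld:/myvolume /mnt/tmpmount` (the mount point /mnt/tmpmount is an assumption).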


Hope that helps.

-------- Original Message --------
Subject: Re: Possible stale .glusterfs/indices/xattrop file?
Local Time: July 31, 2017 10:55 AM
UTC Time: July 31, 2017 8:55 AM




On 07/31/2017 02:00 PM, mabi wrote:
To quickly summarize my current situation:

On node2 I have found the following indices/xattrop file, which matches the GFID shown by the "heal info" command (below is the output of "ls -lai"):

2798404 ---------- 2 root root 0 Apr 28 22:51 /data/myvolume/brick/.glusterfs/indices/xattrop/29e0d13e-1217-41cc-9bda-1fbbf781c397



As you can see this file has inode number 2798404, so I ran the following command on all my nodes (node1, node2 and arbiternode):


...which is what I was saying is incorrect. 2798404 is an XFS inode number and is not common to the same file across nodes. So you will get different results. Use the -samefile flag I shared earlier.
-Ravi
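
To illustrate the point about inode numbers: "-samefile" takes a reference path and matches everything on the same filesystem that shares its inode, so no number has to be copied from one node to another. A minimal sketch, assuming GNU find; the function name hardlinks_of is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: list every path under a brick that is a hard link
# to the reference file. The comparison is done against the reference
# file's own inode on this brick, so the command is the same on every node
# even though the inode numbers differ.
hardlinks_of() {
    brick=$1
    ref=$2
    find "$brick" -samefile "$ref" | sort
}
```

Run on each brick as root, e.g. `hardlinks_of /data/myvolume/brick /data/myvolume/brick/.glusterfs/29/e0/29e0d13e-1217-41cc-9bda-1fbbf781c397`. If the reference path does not exist on a brick, find reports an error instead of listing links.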



sudo find /data/myvolume/brick -inum 2798404 -ls

Here below are the results for all 3 nodes:

node1:

2798404   19 -rw-r--r--   2 www-data www-data       32 Jun 19 17:42 /data/myvolume/brick/.glusterfs/e6/5b/e65b77e2-a4c4-4824-a7bb-58df969ce4b0
2798404   19 -rw-r--r--   2 www-data www-data       32 Jun 19 17:42 /data/myvolume/brick/<REMOVED_DIRECTORIES_IN_BETWEEN>/fileKey

node2:

2798404    1 ----------   2 root     root            0 Apr 28 22:51 /data/myvolume/brick/.glusterfs/indices/xattrop/29e0d13e-1217-41cc-9bda-1fbbf781c397
2798404    1 ----------   2 root     root            0 Apr 28 22:51 /data/myvolume/brick/.glusterfs/indices/xattrop/xattrop-6fa49ad5-71dd-4ec2-9246-7b302ab92d38

arbiternode:

NOTHING

As you requested I have tried to run on node1 a getfattr on the fileKey file by using the following command:

getfattr -m . -d -e hex fileKey

but there is no output. I am not familiar with the getfattr command so maybe I am using the wrong parameters, could you help me with that?
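
A note on reading getfattr output, as a sketch rather than a diagnosis. Gluster stores the gfid in the trusted.gfid extended attribute, and attributes in the trusted.* namespace are normally visible only to root, so one possible reason for empty output (an assumption, the thread does not confirm it) is that the command was run without sudo or not against the file on the brick. With "-e hex" the gfid is printed as a 0x-prefixed string of 32 hex digits; the hypothetical helper below converts that to the dashed form used in .glusterfs paths.

```shell
#!/bin/sh
# Hypothetical helper: convert a hex-encoded trusted.gfid value, as printed
# by `getfattr -e hex`, into the dashed gfid form used under .glusterfs.
hex_to_gfid() {
    hex=${1#0x}    # drop the leading 0x if present
    printf '%s\n' "$hex" |
        sed -E 's/^(.{8})(.{4})(.{4})(.{4})(.{12})$/\1-\2-\3-\4-\5/'
}
```

For example, `hex_to_gfid 0x29e0d13e121741cc9bda1fbbf781c397` prints 29e0d13e-1217-41cc-9bda-1fbbf781c397, which can then be compared with the gfid listed by "heal info".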


-------- Original Message --------
Subject: Re: Possible stale .glusterfs/indices/xattrop file?
Local Time: July 31, 2017 9:25 AM
UTC Time: July 31, 2017 7:25 AM

On 07/31/2017 12:20 PM, mabi wrote:

I did a find on this inode number and I could find the file but only on node1 (nothing on node2 and the new arbiternode). Here is an ls -lai of the file itself on node1:
Sorry I don't understand, isn't that (XFS) inode number specific to node2's brick? If you want to use the same command, maybe you should try `find /data/myvolume/brick -samefile /data/myvolume/brick/.glusterfs/29/e0/29e0d13e-1217-41cc-9bda-1fbbf781c397` on all 3 bricks.


-rw-r--r-- 1 www-data www-data   32 Jun 19 17:42 fileKey

As you can see, it is a 32-byte file. As you suggested, I ran a "stat" on this very same file through a glusterfs mount (using fuse), but unfortunately nothing happened. The GFID is still listed as needing heal. Just in case, here is the output of the stat:

  File: ‘fileKey’
  Size: 32        Blocks: 1          IO Block: 131072 regular file
Device: 1eh/30d Inode: 12086351742306673840  Links: 1
Access: (0644/-rw-r--r--)  Uid: (   33/www-data)   Gid: (   33/www-data)
Access: 2017-06-19 17:42:35.339773495 +0200
Modify: 2017-06-19 17:42:35.343773437 +0200
Change: 2017-06-19 17:42:35.343773437 +0200
Birth: -

Does this 'fileKey' on node1 have the same gfid (see the getfattr output)? It looks like it is missing the hardlink inside the .glusterfs folder, since the link count is only 1.
Thanks,
Ravi
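
The link-count check can be sketched as below. It assumes GNU stat and that it is run against the file on the brick itself; the function names are hypothetical. The reasoning is the one given above: a regular file on a brick is expected to have a second hard link under .glusterfs, so a count below 2 suggests that link is missing.

```shell
#!/bin/sh
# Hypothetical helpers, not part of gluster.

# Print the hard link count of a file (GNU stat syntax).
link_count() {
    stat -c %h "$1"
}

# Succeed if the file has at least two links, i.e. the named file plus
# (presumably) its gfid hard link under .glusterfs.
has_gfid_hardlink() {
    [ "$(link_count "$1")" -ge 2 ]
}
```

A count of 2 or more does not prove that the extra link is the right gfid entry; `find <brick> -samefile <file>` shows where the other links actually live.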

What else can I do or try in order to fix this situation?




-------- Original Message --------
Subject: Re: Possible stale .glusterfs/indices/xattrop file?
Local Time: July 31, 2017 3:27 AM
UTC Time: July 31, 2017 1:27 AM




On 07/30/2017 02:24 PM, mabi wrote:
Hi Ravi,

Thanks for your hints. Below you will find the answer to your questions.

First I tried to start the healing process by running:

gluster volume heal myvolume

and then, as you suggested, watched the glustershd.log file, but nothing appeared in that log file after running the above command. I checked which files need healing using the "heal <volume> info" command and it still shows that very same GFID on node2 to be healed. So nothing changed here.

The file /data/myvolume/brick/.glusterfs/indices/xattrop/29e0d13e-1217-41cc-9bda-1fbbf781c397 exists only on node2 and not on node1 or on my arbiternode. This file seems to be a regular file and not a symlink. Here is the output of the stat command on it from my node2:

  File: ‘/data/myvolume/brick/.glusterfs/indices/xattrop/29e0d13e-1217-41cc-9bda-1fbbf781c397’
  Size: 0         Blocks: 1          IO Block: 512    regular empty file
Device: 25h/37d Inode: 2798404     Links: 2

Okay, link count of 2 means there is a hardlink somewhere on the brick. Try the find command again. I see that the inode number is 2798404, not the one you shared in your first mail. Once you find the path to the file, do a stat of the file from mount. This should create the entry in the other 2 bricks and do the heal. But FWIW, this seems to be a zero byte file.
 
Regards,
Ravi

Access: (0000/----------)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2017-04-28 22:51:15.215775269 +0200
Modify: 2017-04-28 22:51:15.215775269 +0200
Change: 2017-07-30 08:39:03.700872312 +0200
Birth: -

I hope this is enough info for a starter, else let me know if you need any more info. I would be glad to resolve this weird file which needs to be healed but can not.

Best regards,
Mabi



-------- Original Message --------
Subject: Re: Possible stale .glusterfs/indices/xattrop file?
Local Time: July 30, 2017 3:31 AM
UTC Time: July 30, 2017 1:31 AM




On 07/29/2017 04:36 PM, mabi wrote:
Hi,

Sorry for mailing again, but as mentioned in my previous mail, I have added an arbiter node to my replica 2 volume and it seems to have gone fine, except that there is one single file which needs healing and does not get healed, as you can see here from the output of a "heal info":

Brick node1.domain.tld:/data/myvolume/brick
Status: Connected
Number of entries: 0

Brick node2.domain.tld:/data/myvolume/brick
<gfid:29e0d13e-1217-41cc-9bda-1fbbf781c397>
Status: Connected
Number of entries: 1

Brick arbiternode.domain.tld:/srv/glusterfs/myvolume/brick
Status: Connected
Number of entries: 0

On my node2 the respective .glusterfs/indices/xattrop directory contains two files as you can see below:

ls -lai  /data/myvolume/brick/.glusterfs/indices/xattrop
total 76180
     10 drw------- 2 root root 4 Jul 29 12:15 .
      9 drw------- 5 root root 5 Apr 28 22:15 ..
2798404 ---------- 2 root root 0 Apr 28 22:51 29e0d13e-1217-41cc-9bda-1fbbf781c397
2798404 ---------- 2 root root 0 Apr 28 22:51 xattrop-6fa49ad5-71dd-4ec2-9246-7b302ab92d38



I tried to find the real file on my brick that this xattrop file points to, using its inode number (command: find /data/myvolume/brick/data -inum 8394642), but it does not find any associated file.

So my question is: is it possible that this is a stale file which gluster, for some unknown reason, failed to delete from the indices/xattrop directory? If so, is it safe for me to delete these two files? Or what would be the correct process in that case?
The 'xattrop-6fa...' is the base entry. gfids of files that need heal are hard linked to this entry, so nothing needs to be done for it. But you need to find out why '29e0d13...' is not healing. Launch the heal and observe the glustershd logs for errors. I suppose the inode number for .glusterfs/29/e0/29e0d13e-1217-41cc-9bda-1fbbf781c397 is 8394642. Is .glusterfs/29/e0/29e0d13e-1217-41cc-9bda-1fbbf781c397 a regular file or a symlink? Does it exist in the other 2 bricks? What is the link count (as seen from stat <file>)?
-Ravi
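
Following the description above (a base entry named xattrop-<uuid>, with one hard link per gfid that needs heal), the pending gfids on a brick can be listed with a short loop. This is a sketch only; the function name pending_gfids is hypothetical, and the directory is readable only by root.

```shell
#!/bin/sh
# Hypothetical helper: print the gfids recorded in a brick's
# .glusterfs/indices/xattrop directory, skipping the xattrop-<uuid>
# base entry that the gfid entries are hard linked to.
pending_gfids() {
    index_dir=$1
    for entry in "$index_dir"/*; do
        [ -e "$entry" ] || continue      # directory is empty
        name=$(basename "$entry")
        case $name in
            xattrop-*) ;;                # base entry, nothing to heal
            *) printf '%s\n' "$name" ;;
        esac
    done
}
```

For the listing in this thread, `pending_gfids /data/myvolume/brick/.glusterfs/indices/xattrop` on node2 would print the single gfid 29e0d13e-1217-41cc-9bda-1fbbf781c397, which matches what "heal info" reports.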


Thank you for your input.
Mabi


_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users







