Re: Questions about gluster rebalance

On 09/10/2014 03:27 AM, Paul Guo wrote:
Hello,

Recently I spent a bit of time trying to understand rebalance, since I want
to know how it performs as more and more bricks are added to my glusterfs
volume and as the number of files and directories in the existing volume
grows. During the test I saw something that really confuses me.

Steps:

SW versions: glusterfs 3.4.4 + centos 6.5
Initial configuration: replica 2, lab1:/brick1 + lab2:/brick1

fuse_mount it on /mnt
cp -rf /sbin /mnt (~300+ files under /sbin)
add two more bricks: lab1:/brick2 + lab2:/brick2.
run gluster rebalance.

1) fix-layout only (e.g. gluster volume rebalance g1 fix-layout start)

After rebalance is done (observed via "gluster volume rebalance g1
status"), I found that there are no files under lab1:/brick2/sbin. The
hash ranges of the new brick lab1:/brick2/sbin and the old brick
lab1:/brick1/sbin appear to be OK.

[root@lab1 Desktop]# getfattr -dm. -e hex /brick2/sbin
getfattr: Removing leading '/' from absolute path names
# file: brick2/sbin
trusted.gfid=0x35976c2034d24dc2b0639fde18de007d
trusted.glusterfs.dht=0x00000001000000007fffffffffffffff

[root@lab1 Desktop]# getfattr -dm. -e hex /brick1/sbin
getfattr: Removing leading '/' from absolute path names
# file: brick1/sbin
trusted.gfid=0x35976c2034d24dc2b0639fde18de007d
trusted.glusterfs.dht=0x0000000100000000000000007ffffffe

The question is: AFAIK, fix-layout creates "linkto" files
(files with the "linkto" xattr and only the sticky bit set)
for those files whose hash values belong to the new subvol.
So there should have been some "linkto" files under lab1:/brick2,
but there are none. Why?

fix-layout only fixes the layout, i.e. it spreads the layout to the newer bricks (or to bricks that were previously not participating in the layout). It does not create the linkto files.
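
To see what "spreading the layout" did here, the two trusted.glusterfs.dht values from the getfattr output above can be decoded. The following is only a minimal sketch: it assumes the value is four 32-bit big-endian integers, with the last two being the start and stop of the hash range. The field names and the helper decode_dht_layout are made up for illustration and are not taken from the glusterfs sources.

```python
import struct


def decode_dht_layout(hex_value):
    """Split a trusted.glusterfs.dht value (as printed by getfattr -e hex)
    into four unsigned 32-bit integers.

    Assumption: the fields are, in order, a count, a type, and the
    start/stop of the hash range, all in network byte order.
    """
    text = hex_value[2:] if hex_value.startswith("0x") else hex_value
    raw = bytes.fromhex(text)
    if len(raw) != 16:
        raise ValueError("expected a 16-byte layout value")
    count, layout_type, start, stop = struct.unpack(">4I", raw)
    return {"count": count, "type": layout_type, "start": start, "stop": stop}


# Values copied from the getfattr output for brick1/sbin and brick2/sbin.
brick1 = decode_dht_layout("0x0000000100000000000000007ffffffe")
brick2 = decode_dht_layout("0x00000001000000007fffffffffffffff")

for name, layout in (("brick1", brick1), ("brick2", brick2)):
    print(f"{name}: 0x{layout['start']:08x} .. 0x{layout['stop']:08x}")
```

Under that assumption, brick1/sbin covers the lower half of the 32-bit hash space and brick2/sbin the upper half, with no gap or overlap between them, which matches the observation that the ranges look OK after fix-layout.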

After fix-layout, if one performs a lookup on a file that, according to the layout and the hash of its name, should belong to the newer brick, one can then see the linkto file present on that brick.

Hope this explains (1).


2) fix-layout + data_migrate (e.g. gluster volume rebalance g1 start)

After migration is done, I saw linkto files under brick2/sbin.
There are 300+ files in total under the system /sbin. Under brick2/sbin,
I found that all 300+ files are there, either migrated or as linkto files.

-rwxr-xr-x 2 root root   17400 Sep 10 12:02 vmcore-dmesg
---------T 2 root root       0 Sep 10 12:03 weak-modules
---------T 2 root root       0 Sep 10 12:03 wipefs
-rwxr-xr-x 2 root root  295656 Sep 10 12:02 xfsdump
-rwxr-xr-x 2 root root  510000 Sep 10 12:02 xfs_repair
-rwxr-xr-x 2 root root  348088 Sep 10 12:02 xfsrestore

And under brick1/sbin, the migrated files are gone, as expected.
There are nearly 150 files under brick1/sbin.

This confuses me, since creating those linkto files seems
unnecessary, at least for files whose hash values do not belong
to the subvol. (My understanding is that if a file's hash value is
in the range of a subvol, then the file is stored in that subvol.)

Can you check if a lookup of the file post rebalance clears up these _stale_ linkto files?
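
One way to make that check repeatable is to count linkto-looking entries on the brick before and after the lookup. This is only a rough sketch: it treats any zero-byte regular file whose permission bits are exactly the sticky bit (the ---------T entries in your listing) as a linkto file, and it does not read the xattr itself (trusted.glusterfs.dht.linkto, as far as I recall), which would be the stricter test. The function names are made up for illustration.

```python
import os
import stat


def looks_like_linkto(mode, size):
    """Heuristic: a zero-byte regular file with only the sticky bit set,
    i.e. what ls shows as ---------T with size 0."""
    return (
        stat.S_ISREG(mode)
        and size == 0
        and stat.S_IMODE(mode) == stat.S_ISVTX
    )


def count_linkto(brick_dir):
    """Return (linkto_count, data_count) for the regular files directly
    under brick_dir. Subdirectories and other entry types are skipped."""
    linkto = 0
    data = 0
    for entry in os.scandir(brick_dir):
        st = entry.stat(follow_symlinks=False)
        if not stat.S_ISREG(st.st_mode):
            continue
        if looks_like_linkto(st.st_mode, st.st_size):
            linkto += 1
        else:
            data += 1
    return linkto, data
```

Calling count_linkto("/brick2/sbin") on the brick host, then running something like ls -l on /mnt/sbin through the fuse mount to force lookups, then calling it again should show whether the linkto count drops.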

How did you compute the hash of these files and decide that they do not belong to the new brick (i.e. brick2)? I computed the hashes on my end and you are right (based on the layout you presented above), but I am curious how you arrived at the same conclusion.

Rebalance could choose not to move a file and instead just create the linkto file, based on, among other things, space usage on the source and target bricks. I am not saying this is what happened here, but it is a possibility.


I quickly looked at the code. gf_defrag_start_crawl() appears to
be the function for this operation. I do see code that does file migration
in that code path, but my debugging output shows that those "linkto" files
do not seem to be created by gf_defrag_start_crawl(). I'm not that familiar
with the code details and the theory, so I'm not sure what created those
"linkto" files or why they are created.

I am going to leave this part at this: dht_linkfile_create does this, and it mostly happens during lookup.

Shyam




