Re: gluster source code help

Hi Ravi,
Thanks. I have created a simple 2-node replica volume.

root@dhcp-192-168-36-220:/home/user/gluster/rep-brick1#  gluster v info rep-vol
 
Volume Name: rep-vol
Type: Replicate
Volume ID: c9c9ef39-27e5-44d5-be69-82423c743304
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.36.220:/home/user/gluster/rep-brick1
Brick2: 192.168.36.220:/home/user/gluster/rep-brick2
Options Reconfigured:
features.inode-quota: off
features.quota: off
performance.readdir-ahead: on

Then I killed the brick1 process.

root@dhcp-192-168-36-220:/home/user/gluster/rep-brick1# gluster v status rep-vol
Status of volume: rep-vol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.36.220:/home/user/gluster/rep
-brick1                                     N/A       N/A        N       N/A  
Brick 192.168.36.220:/home/user/gluster/rep
-brick2                                     49211     0          Y       20157
NFS Server on localhost                     N/A       N/A        N       N/A  
Self-heal Daemon on localhost               N/A       N/A        Y       20186
 
Task Status of Volume rep-vol
------------------------------------------------------------------------------
There are no active volume task

Then I copied wish.txt to the mount directory.

From brick2:

root@dhcp-192-168-36-220:/home/user/gluster/rep-brick2/.glusterfs# getfattr -d -e hex -m . ../wish.txt 
# file: ../wish.txt
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.rep-vol-client-0=0x000000020000000100000000
trusted.bit-rot.version=0x0200000000000000589ab1410003e910
trusted.gfid=0xe9f3aafb3f844bca8922a00d48abc643
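
If I read that value the way the afr-v1.md document (linked below) describes the changelog xattr, it is three big-endian 32-bit counters (data, metadata, entry) and breaks down as:

trusted.afr.rep-vol-client-0 = 0x 00000002 00000001 00000000
                                  (data=2) (metadata=1) (entry=0)

i.e. brick2 is recording 2 pending data operations and 1 pending metadata operation for brick1 (client-0).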


root@dhcp-192-168-36-220:/home/user/gluster/rep-brick2/.glusterfs/indices/xattrop# ll
total 8
drw------- 2 root root 4096 Feb  8 13:50 ./
drw------- 4 root root 4096 Feb  8 13:48 ../
---------- 4 root root    0 Feb  8 13:50 00000000-0000-0000-0000-000000000001
---------- 4 root root    0 Feb  8 13:50 00000000-0000-0000-0000-000000000005
---------- 4 root root    0 Feb  8 13:50 e9f3aafb-3f84-4bca-8922-a00d48abc643
---------- 4 root root    0 Feb  8 13:50 xattrop-b3beb437-cea4-46eb-9eb4-8d83bfa7baa1


In the above, I can see the gfid of wish.txt (e9f3aafb-3f84-4bca-8922-a00d48abc643), which needs to be healed.
1. What are "00000000-0000-0000-0000-000000000001" and "00000000-0000-0000-0000-000000000005"?
(I can understand trusted.afr.rep-vol-client-0 as the changelog of brick1 as seen by brick2, from https://github.com/gluster/glusterfs-specs/blob/master/done/Features/afr-v1.md.)

2. I know xattrop-* is a base file. How is this related to the files which require healing? (Assuming more than one file is to be healed.)
    What does the UUID suffix of xattrop-* (xattrop-b3beb437-cea4-46eb-9eb4-8d83bfa7baa1) signify?

3. After brick1 is brought back online, the file is healed. Now only xattrop-* remains under .glusterfs/indices/xattrop.
  But there is still a gfid entry in the .glusterfs/e9/f3 directory. Is this expected behavior?
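
(For question 3, my assumption is that the .glusterfs/e9/f3/<gfid> entry is simply a hard link to the regular file itself, so something like the following, run from the brick root, should show both paths with the same inode number; please correct me if that assumption is wrong.)

ls -li wish.txt .glusterfs/e9/f3/e9f3aafb-3f84-4bca-8922-a00d48abc643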






On Tue, Feb 7, 2017 at 8:21 PM, Ravishankar N <ravishankar@xxxxxxxxxx> wrote:
On 02/07/2017 01:32 PM, jayakrishnan mm wrote:


On Mon, Feb 6, 2017 at 6:05 PM, Ravishankar N <ravishankar@xxxxxxxxxx> wrote:
On 02/06/2017 03:15 PM, jayakrishnan mm wrote:


On Mon, Feb 6, 2017 at 2:36 PM, jayakrishnan mm <jayakrishnan.mm@xxxxxxxxx> wrote:


On Fri, Feb 3, 2017 at 7:58 PM, Ravishankar N <ravishankar@xxxxxxxxxx> wrote:
On 02/03/2017 09:14 AM, jayakrishnan mm wrote:


On Thu, Feb 2, 2017 at 8:17 PM, Ravishankar N <ravishankar@xxxxxxxxxx> wrote:
On 02/02/2017 10:46 AM, jayakrishnan mm wrote:
Hi

How do I determine which part of the code runs on the client and which part runs on the server nodes by merely looking at the glusterfs source code?
I know there are client-side and server-side translators which run on the respective platforms. I am looking at the part of the self-heal daemon source (ec/afr) which runs on the server nodes and the part which runs on the clients.
 
The self-heal daemon that runs on the server is also a client process in the sense that it has client-side xlators like ec or afr and protocol/client loaded (see the shd volfile 'glustershd-server.vol') and talks to the bricks like a normal client does.
The difference is that only self-heal related 'logic' gets executed on the shd, while both self-heal and I/O related logic get executed from the mount. The self-heal logic resides mostly in afr-self-heal*.[ch] while the I/O related logic is in the other files.
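For example, on one of the server nodes something like this should show the shd running as just another glusterfs client process and let you read the graph it loads:

# the shd is a normal glusterfs process; its PID also appears in the
# 'Self-heal Daemon' row of 'gluster volume status'
ps aux | grep '[g]lustershd'
# its graph contains protocol/client and cluster/replicate (afr) xlators,
# much like a fuse mount's client graph
cat /var/lib/glusterd/glustershd/glustershd-server.vol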
HTH,
Ravi

Hi JK,
Dear Ravi,
Thanks for your kind explanation.
So, each server node will have a separate self-heal daemon (shd) up and running every time a child_up event occurs, and this will be an index healer.
And each daemon will spawn "priv->child_count" number of threads on each server node, correct?
shd is always running and yes those many threads are spawned for index heal when the process starts.
1. When exactly does a full healer spawn threads?
Whenever you run `gluster volume heal volname full`. See afr_xl_op(). There are some bugs in launching full heal though.
2. When can GF_EVENT_TRANSLATOR_OP & GF_SHD_OP_HEAL_INDEX happen together (so that the index healer spawns threads)?
    Similarly, when can GF_EVENT_TRANSLATOR_OP & GF_SHD_OP_HEAL_FULL happen? During replace-brick?
Is it possible that the index healer and full healer spawn threads together (so that the total number of threads is 2*priv->child_count)?

Index heal threads wake up and run once every 10 minutes, or whatever cluster.heal-timeout is. They are also run when a brick comes up, like you said, via afr_notify(). They are also run when you manually launch `gluster volume heal volname`. Again, see afr_xl_op().
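In CLI terms (taking rep-vol from this thread as the example volume), those triggers map roughly to:

# manually launch an index heal (the afr_xl_op() path)
gluster volume heal rep-vol
# launch a full heal
gluster volume heal rep-vol full
# list the entries currently pending heal
gluster volume heal rep-vol info
# change the periodic index-heal interval (default 600 seconds, i.e. 10 minutes)
gluster volume set rep-vol cluster.heal-timeout 300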
3. In /var/lib/glusterd/glustershd/glustershd-server.vol, why is debug/io-stats chosen as the top xlator?

io-stats is generally loaded as the topmost xlator in all graphs at the appropriate place for gathering profile info, but for shd, I'm not sure if it has any specific use other than acting as a placeholder parent for all the replica xlators.
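For example, listing the xlator declarations in that volfile should show a debug/io-stats entry alongside the replicate and protocol/client xlators that make up the shd graph (exact contents vary with the volume layout and gluster version):

# each 'volume ... / type ...' pair in the volfile is one xlator in the shd graph
grep -E '^volume |type ' /var/lib/glusterd/glustershd/glustershd-server.vol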



Hi Ravi,

The self-heal daemon searches the .glusterfs/indices/xattrop directory for the files/dirs to be healed. Who updates this information, and on what basis?


Please see https://github.com/gluster/glusterfs-specs/blob/master/done/Features/afr-v1.md; it is a bit dated (relevant to AFR v1, which is in glusterfs 3.5 and older, I think) but the concepts are similar. The entries are added/removed by the index translator during the pre-op/post-op phases of the AFR transaction.
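One way to watch this happen (assuming a mount at /mnt/rep-vol; substitute your actual mount point): with all bricks up, entries show up in the xattrop directory for the duration of a write (pre-op) and are removed once the post-op succeeds.

# terminal 1: watch the index directory on one of the bricks
watch -n 1 ls /home/user/gluster/rep-brick1/.glusterfs/indices/xattrop
# terminal 2: write through the mount (mount path is only an example)
dd if=/dev/zero of=/mnt/rep-vol/testfile bs=1M count=200 oflag=sync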

Hi Ravi,

  Went through the document and source code. I see there are options to enable/disable entry/data/metadata change logs. If "data-change-log" is 1 (by default it is 1), the data changelog is enabled, which makes __changelog_enabled() return 1 and afr_changelog_pre_op() get called thereafter. Similar logic applies to the post-op, which occurs just before the unlock.
Is this responsible for creating/deleting entries inside .glusterfs/indices/xattrop?
Yes, index_xattrop() adds the entry during pre-op and removes it during post-op if it was successful.

Currently I can't verify this, since the mount point for the rep volume hangs when data-change-log is set to 0 (using glusterfs v3.7.15). Ideally, the entries should not appear (in the case of a brick failure and a write thereafter) if this option is set to '0'. Am I correct?

Best Regards
JK




Thanks Ravi, for the explanation.
Regards
JK 

Regards,
Ravi
Thanks
Best regards 

Best regards
JK


_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-devel


