Re: Re: Re: geo-replication status partial faulty


 



I have checked all the nodes, both masters and slaves; the software is the same on all of them.

 

I am puzzled why half of the masters work while the other half are faulty.

 

 

[admin@SVR6996HW2285 ~]$ rpm -qa |grep gluster

glusterfs-api-3.6.3-1.el6.x86_64

glusterfs-fuse-3.6.3-1.el6.x86_64

glusterfs-geo-replication-3.6.3-1.el6.x86_64

glusterfs-3.6.3-1.el6.x86_64

glusterfs-cli-3.6.3-1.el6.x86_64

glusterfs-server-3.6.3-1.el6.x86_64

glusterfs-libs-3.6.3-1.el6.x86_64

 

 

 

 

Best Regards

杨雨阳 Yuyang Yang

OPS

Ctrip Infrastructure Service (CIS)

Ctrip Computer Technology (Shanghai) Co., Ltd 
Phone: + 86 21 34064880-15554 | Fax: + 86 21 52514588-13389
Web: www.Ctrip.com

 

 

From: Saravanakumar Arumugam [mailto:sarumuga@xxxxxxxxxx]
Sent: Thursday, May 19, 2016 4:33 PM
To: vyyy杨雨阳 <yuyangyang@xxxxxxxxx>; Gluster-users@xxxxxxxxxxx; Aravinda Vishwanathapura Krishna Murthy <avishwan@xxxxxxxxxx>; Kotresh Hiremath Ravishankar <khiremat@xxxxxxxxxx>
Subject: Re: Re: [Gluster-users] Re: geo-replication status partial faulty

 

Hi,
+geo-rep team.

Can you get the gluster version you are using?

# For example:
rpm -qa | grep gluster

I hope you have the same gluster version installed everywhere.
Please double check and share the same.
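
One way to make that comparison less error-prone is to save the package list from each node and compare the files. This is only a sketch: the function name compare_pkg_lists and the pkgs.<node> file names are made up for illustration, and it assumes each file holds the output of the rpm command above from one node.

```shell
# Compare saved package lists against the first one given.
# Each argument is a file containing `rpm -qa | grep gluster` output from one node.
# (compare_pkg_lists is a hypothetical helper, not part of gluster.)
compare_pkg_lists() {
    ref=$1
    rc=0
    for f in "$@"; do
        # Sort both lists so package order does not matter.
        if [ "$(sort "$ref")" != "$(sort "$f")" ]; then
            echo "MISMATCH: $f differs from $ref"
            rc=1
        fi
    done
    return $rc
}

# Usage, with placeholder host names:
#   for n in master1 slave1; do ssh "root@$n" 'rpm -qa | grep gluster' > "pkgs.$n"; done
#   compare_pkg_lists pkgs.* && echo "all nodes match"
```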

Thanks,
Saravana

On 05/19/2016 01:37 PM, vyyy杨雨阳 wrote:

Hi, Saravana

 

I have changed the log level to DEBUG, then started geo-replication with the --log-file option. The log file is attached.

 

gluster volume geo-replication filews glusterfs01.sh3.ctripcorp.com::filews_slave start --log-file=geo.log

 

I have checked /root/.ssh/authorized_keys on glusterfs01.sh3.ctripcorp.com. It has the entries from /var/lib/glusterd/geo-replication/common_secret.pem.pub, and I have removed the lines that do not start with command=.

 

ssh -i /var/lib/glusterd/geo-replication/secret.pem root@glusterfs01.sh3.ctripcorp.com

I can see gsyncd messages and no ssh error.

 

 

I have attached etc-glusterfs-glusterd.vol.log from a faulty node. It shows:

 

[2016-05-19 06:39:23.405974] I [glusterd-geo-rep.c:3516:glusterd_read_status_file] 0-: Using passed config template(/var/lib/glusterd/geo-replication/filews_glusterfs01.sh3.ctripcorp.com_filews_slave/gsyncd.conf).

[2016-05-19 06:39:23.541169] E [glusterd-geo-rep.c:3200:glusterd_gsync_read_frm_status] 0-: Unable to read gsyncd status file

[2016-05-19 06:39:23.541210] E [glusterd-geo-rep.c:3603:glusterd_read_status_file] 0-: Unable to read the statusfile for /export/sdb/filews brick for  filews(master), glusterfs01.sh3.ctripcorp.com::filews_slave(slave) session

[2016-05-19 06:39:29.472047] I [glusterd-geo-rep.c:1835:glusterd_get_statefile_name] 0-: Using passed config template(/var/lib/glusterd/geo-replication/filews_glusterfs01.sh3.ctripcorp.com_filews_slave/gsyncd.conf).

[2016-05-19 06:39:34.939709] I [glusterd-geo-rep.c:3516:glusterd_read_status_file] 0-: Using passed config template(/var/lib/glusterd/geo-replication/filews_glusterfs01.sh3.ctripcorp.com_filews_slave/gsyncd.conf).

[2016-05-19 06:39:35.058520] E [glusterd-geo-rep.c:3200:glusterd_gsync_read_frm_status] 0-: Unable to read gsyncd status file

 

 

/var/log/glusterfs/geo-replication/filews/ssh%3A%2F%2Froot%4010.15.65.66%3Agluster%3A%2F%2F127.0.0.1%3Afilews_slave.log shows the following:

 

[2016-05-19 15:11:37.307755] I [monitor(monitor):215:monitor] Monitor: ------------------------------------------------------------

[2016-05-19 15:11:37.308059] I [monitor(monitor):216:monitor] Monitor: starting gsyncd worker

[2016-05-19 15:11:37.423320] D [gsyncd(agent):627:main_i] <top>: rpc_fd: '7,11,10,9'

[2016-05-19 15:11:37.423882] I [changelogagent(agent):72:__init__] ChangelogAgent: Agent listining...

[2016-05-19 15:11:37.423906] I [monitor(monitor):267:monitor] Monitor: worker(/export/sdb/filews) died before establishing connection

[2016-05-19 15:11:37.424151] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.

[2016-05-19 15:11:37.424335] I [syncdutils(agent):214:finalize] <top>: exiting.
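
To see how often each brick's worker fails this way, the monitor lines can be tallied from the log. A rough sketch, assuming the message format shown above; count_worker_deaths is a name made up here.

```shell
# Count "died before establishing connection" events per brick.
# Arguments: one or more geo-replication log files.
# (count_worker_deaths is a hypothetical helper, not part of gluster.)
count_worker_deaths() {
    grep -h 'died before establishing connection' "$@" |
        # Keep only the brick path inside worker(...).
        sed -n 's/.*worker(\([^)]*\)) died before establishing connection.*/\1/p' |
        sort | uniq -c
}

# Usage:
#   count_worker_deaths /var/log/glusterfs/geo-replication/filews/*.log
```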

 

 

 

 

 

 

Best Regards

Yuyang Yang

 

 

 

From: Saravanakumar Arumugam [mailto:sarumuga@xxxxxxxxxx]
Sent: Thursday, May 19, 2016 1:59 PM
To: vyyy杨雨阳 <yuyangyang@xxxxxxxxx>; Gluster-users@xxxxxxxxxxx
Subject: Re: [Gluster-users] Re: geo-replication status partial faulty

 

Hi,

There seems to be some issue on the glusterfs01.sh3.ctripcorp.com slave node.
Can you share the complete logs?

You can increase verbosity of debug messages like this:
gluster volume geo-replication <master volume> <slave host>::<slave volume> config log-level DEBUG


Also, check /root/.ssh/authorized_keys on glusterfs01.sh3.ctripcorp.com.
It should contain the entries from /var/lib/glusterd/geo-replication/common_secret.pem.pub (present on the master node).

Have a look at this one for example:
https://www.gluster.org/pipermail/gluster-users/2015-August/023174.html
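
The authorized_keys check can be sketched roughly as below. It assumes both files have been copied to the same machine and that the entries must match as whole lines, which may be stricter than necessary; check_keys_installed is a made-up helper name.

```shell
# Verify that every line of the master's public key file is present,
# as an exact whole line, in the slave's authorized_keys.
# $1: copy of common_secret.pem.pub   $2: copy of authorized_keys
# (check_keys_installed is a hypothetical helper, not part of gluster.)
check_keys_installed() {
    pub=$1
    auth=$2
    missing=0
    while IFS= read -r line; do
        [ -n "$line" ] || continue          # skip empty lines
        if ! grep -qxF -- "$line" "$auth"; then
            echo "missing: $line"
            missing=$((missing + 1))
        fi
    done < "$pub"
    echo "$missing key line(s) missing"
    [ "$missing" -eq 0 ]
}

# Usage:
#   check_keys_installed common_secret.pem.pub authorized_keys
```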

Thanks,
Saravana

On 05/19/2016 07:53 AM, vyyy杨雨阳 wrote:

Hello,

 

I have tried to configure a geo-replication volume. The configuration is the same on all master nodes, but when I start the volume, the status shows some of the nodes as faulty, as follows:

 

gluster volume geo-replication filews glusterfs01.sh3.ctripcorp.com::filews_slave status

 

MASTER NODE      MASTER VOL    MASTER BRICK          SLAVE                                          STATUS     CHECKPOINT STATUS    CRAWL STATUS       

-------------------------------------------------------------------------------------------------------------------------------------------------

SVR8048HW2285    filews        /export/sdb/filews    glusterfs01.sh3.ctripcorp.com::filews_slave    faulty     N/A                  N/A                

SVR8050HW2285    filews        /export/sdb/filews    glusterfs03.sh3.ctripcorp.com::filews_slave    Passive    N/A                  N/A                

SVR8047HW2285    filews        /export/sdb/filews    glusterfs01.sh3.ctripcorp.com::filews_slave    Active     N/A                  Hybrid Crawl       

SVR8049HW2285    filews        /export/sdb/filews    glusterfs05.sh3.ctripcorp.com::filews_slave    Active     N/A                  Hybrid Crawl       

SH02SVR5951      filews        /export/sdb/brick1    glusterfs06.sh3.ctripcorp.com::filews_slave    Passive    N/A                  N/A                

SH02SVR5953      filews        /export/sdb/brick1    glusterfs01.sh3.ctripcorp.com::filews_slave    faulty     N/A                  N/A                

SVR6995HW2285    filews        /export/sdb/filews    glusterfs01.sh3.ctripcorp.com::filews_slave    faulty     N/A                  N/A                

SH02SVR5954      filews        /export/sdb/brick1    glusterfs01.sh3.ctripcorp.com::filews_slave    faulty     N/A                  N/A                

SVR6994HW2285    filews        /export/sdb/filews    glusterfs02.sh3.ctripcorp.com::filews_slave    Passive    N/A                  N/A                

SVR6993HW2285    filews        /export/sdb/filews    glusterfs01.sh3.ctripcorp.com::filews_slave    faulty     N/A                  N/A                

SH02SVR5952      filews        /export/sdb/brick1    glusterfs01.sh3.ctripcorp.com::filews_slave    faulty     N/A                  N/A                

SVR6996HW2285    filews        /export/sdb/filews    glusterfs04.sh3.ctripcorp.com::filews_slave    Passive    N/A                  N/A   
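
Tallying the table by slave host and status makes it easier to see whether the faulty workers cluster on one slave. The following is a sketch that assumes the column layout shown above (slave in the fourth column, status in the fifth); summarize_georep_status is a name made up here.

```shell
# Count workers per (slave, status) from geo-replication status output on stdin.
# (summarize_georep_status is a hypothetical helper, not part of gluster.)
summarize_georep_status() {
    awk '
        # Data rows are those whose 4th field looks like host::volume;
        # this skips the header, the dashed rule, and blank lines.
        $4 ~ /::/ { count[$4 " " $5]++ }
        END { for (k in count) print count[k], k }
    ' | sort -k2,2 -k3,3
}

# Usage:
#   gluster volume geo-replication filews glusterfs01.sh3.ctripcorp.com::filews_slave status | summarize_georep_status
```

For the table above, this reports six faulty workers, all against glusterfs01.sh3.ctripcorp.com::filews_slave, plus one Active worker on that same slave.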

 

On a faulty node, the log file under /var/log/glusterfs/geo-replication/filews shows that worker(/export/sdb/filews) died before establishing connection:

 

[2016-05-18 16:55:46.402622] I [monitor(monitor):215:monitor] Monitor: ------------------------------------------------------------

[2016-05-18 16:55:46.402930] I [monitor(monitor):216:monitor] Monitor: starting gsyncd worker

[2016-05-18 16:55:46.517460] I [changelogagent(agent):72:__init__] ChangelogAgent: Agent listining...

[2016-05-18 16:55:46.518066] I [repce(agent):92:service_loop] RepceServer: terminating on reaching EOF.

[2016-05-18 16:55:46.518279] I [syncdutils(agent):214:finalize] <top>: exiting.

[2016-05-18 16:55:46.518194] I [monitor(monitor):267:monitor] Monitor: worker(/export/sdb/filews) died before establishing connection

[2016-05-18 16:55:56.697036] I [monitor(monitor):215:monitor] Monitor: ------------------------------------------------------------

 

Any advice and suggestions will be greatly appreciated.

 

 

 

 

 

Best Regards

杨雨阳 Yuyang Yang

 





_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users

 

 

