On 12/04/16 15:02, Stefano Panella wrote:
> Hi Christine,
>
> thanks for your input. I have checked and in the configuration with only one network I have debugging turned on as well (same corosync.conf files).
>
> These messages are repeating every 1-2 seconds and the reason why I think there is something wrong is that if I do operations on a sqlite3 db on the GFS2 filesystem the operations are much slower when I have the secondary network as well (and the extra logging)
>

The messages are just debugging messages - they are not indicative of any problem. If anything they show that everything is fine - with corosync at least. They will slow things down a little though.

Chrissie

> If I try to strace the sqlite3 command, it is stuck for a few seconds (very similar to the period of the logging repeating) in a fcntl system call needed to lock the db file
> ________________________________________
> From: linux-cluster-bounces@xxxxxxxxxx <linux-cluster-bounces@xxxxxxxxxx> on behalf of Christine Caulfield <ccaulfie@xxxxxxxxxx>
> Sent: Tuesday, April 12, 2016 2:28 PM
> To: linux-cluster@xxxxxxxxxx
> Subject: Re: Help with corosync and GFS2 on multi network setup
>
> On 12/04/16 13:45, Stefano Panella wrote:
>> Hi everybody,
>>
>> we have been using corosync directly to provide clustering for GFS2 on our CentOS 7.2 pools with only one network interface and all has been working great so far!
>>
>> We now have a new set-up with two network interfaces for every host in the cluster:
>> A -> 1 Gbit (the one we would like corosync to use, 10.220.88.X)
>> B -> 10 Gbit (used for iSCSI connection to storage, 10.220.246.X)
>>
>> when we run corosync in this mode we get the logs continuously spammed by messages like these:
>>
>> [12880] cl15-02 corosyncdebug [TOTEM ] entering GATHER state from 0(consensus timeout).
>> [12880] cl15-02 corosyncdebug [TOTEM ] Creating commit token because I am the rep.
>> [12880] cl15-02 corosyncdebug [TOTEM ] Saving state aru 10 high seq received 10
>> [12880] cl15-02 corosyncdebug [MAIN ] Storing new sequence id for ring 5750
>> [12880] cl15-02 corosyncdebug [TOTEM ] entering COMMIT state.
>> [12880] cl15-02 corosyncdebug [TOTEM ] got commit token
>> [12880] cl15-02 corosyncdebug [TOTEM ] entering RECOVERY state.
>> [12880] cl15-02 corosyncdebug [TOTEM ] TRANS [0] member 10.220.88.41:
>> [12880] cl15-02 corosyncdebug [TOTEM ] TRANS [1] member 10.220.88.47:
>> [12880] cl15-02 corosyncdebug [TOTEM ] position [0] member 10.220.88.41:
>> [12880] cl15-02 corosyncdebug [TOTEM ] previous ring seq 574c rep 10.220.88.41
>> [12880] cl15-02 corosyncdebug [TOTEM ] aru 10 high delivered 10 received flag 1
>> [12880] cl15-02 corosyncdebug [TOTEM ] position [1] member 10.220.88.47:
>> [12880] cl15-02 corosyncdebug [TOTEM ] previous ring seq 574c rep 10.220.88.41
>> [12880] cl15-02 corosyncdebug [TOTEM ] aru 10 high delivered 10 received flag 1
>>
>> [12880] cl15-02 corosyncdebug [TOTEM ] Did not need to originate any messages in recovery.
>> [12880] cl15-02 corosyncdebug [TOTEM ] got commit token
>> [12880] cl15-02 corosyncdebug [TOTEM ] Sending initial ORF token
>> [12880] cl15-02 corosyncdebug [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 0, aru 0
>> [12880] cl15-02 corosyncdebug [TOTEM ] install seq 0 aru 0 high seq received 0
>> [12880] cl15-02 corosyncdebug [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 1, aru 0
>> [12880] cl15-02 corosyncdebug [TOTEM ] install seq 0 aru 0 high seq received 0
>> [12880] cl15-02 corosyncdebug [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 2, aru 0
>> [12880] cl15-02 corosyncdebug [TOTEM ] install seq 0 aru 0 high seq received 0
>> [12880] cl15-02 corosyncdebug [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 3, aru 0
>> [12880] cl15-02 corosyncdebug [TOTEM ] install seq 0 aru 0 high seq received 0
>> [12880] cl15-02 corosyncdebug [TOTEM ] retrans flag count 4 token aru 0 install seq 0 aru 0 0
>> [12880] cl15-02 corosyncdebug [TOTEM ] Resetting old ring state
>> [12880] cl15-02 corosyncdebug [TOTEM ] recovery to regular 1-0
>> [12880] cl15-02 corosyncdebug [TOTEM ] waiting_trans_ack changed to 1
>> Apr 11 16:19:54 [13372] cl15-02 pacemakerd: info: pcmk_quorum_notification: Membership 22352: quorum retained (2)
>> Apr 11 16:19:54 [13378] cl15-02 crmd: info: pcmk_quorum_notification: Membership 22352: quorum retained (2)
>> [12880] cl15-02 corosyncdebug [TOTEM ] entering OPERATIONAL state.
>> [12880] cl15-02 corosyncnotice [TOTEM ] A new membership (10.220.88.41:22352) was formed. Members
>> [12880] cl15-02 corosyncdebug [SYNC ] Committing synchronization for corosync configuration map access
>> Apr 11 16:19:54 [13373] cl15-02 cib: info: cib_process_request: Forwarding cib_modify operation for section nodes to master (origin=local/crmd/27157)
>> [12880] cl15-02 corosyncdebug [CMAP ] Not first sync -> no action
>> Apr 11 16:19:54 [13373] cl15-02 cib: info: cib_process_request: Forwarding cib_modify operation for section status to master (origin=local/crmd/27158)
>> [12880] cl15-02 corosyncdebug [CPG ] got joinlist message from node 0x2
>> [12880] cl15-02 corosyncdebug [CPG ] comparing: sender r(0) ip(10.220.88.41) ; members(old:2 left:0)
>> [12880] cl15-02 corosyncdebug [CPG ] comparing: sender r(0) ip(10.220.88.47) ; members(old:2 left:0)
>> [12880] cl15-02 corosyncdebug [CPG ] chosen downlist: sender r(0) ip(10.220.88.41) ; members(old:2 left:0)
>> [12880] cl15-02 corosyncdebug [CPG ] got joinlist message from node 0x1
>> [12880] cl15-02 corosyncdebug [SYNC ] Committing synchronization for corosync cluster closed process group service v1.01
>> Apr 11 16:19:54 [13373] cl15-02 cib: info: cib_process_request: Completed cib_modify operation for section nodes: OK (rc=0, origin=cl15-02/crmd/27157, version=0.18.22)
>> [12880] cl15-02 corosyncdebug [CPG ] joinlist_messages[0] group:clvmd, ip:r(0) ip(10.220.88.41) , pid:35677
>> Apr 11 16:19:54 [13373] cl15-02 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=cl15-02/crmd/27158, version=0.18.22)
>> [12880] cl15-02 corosyncdebug [CPG ] joinlist_messages[1] group:dlm:ls:clvmd\x00, ip:r(0) ip(10.220.88.41) , pid:34995
>> [12880] cl15-02 corosyncdebug [CPG ] joinlist_messages[2] group:dlm:controld\x00, ip:r(0) ip(10.220.88.41) , pid:34995
>> [12880] cl15-02 corosyncdebug [CPG ] joinlist_messages[3] group:crmd\x00, ip:r(0) ip(10.220.88.41) , pid:13378
>> [12880] cl15-02 corosyncdebug [CPG ] joinlist_messages[4] group:attrd\x00, ip:r(0) ip(10.220.88.41) , pid:13376
>> [12880] cl15-02 corosyncdebug [CPG ] joinlist_messages[5] group:stonith-ng\x00, ip:r(0) ip(10.220.88.41) , pid:13374
>> [12880] cl15-02 corosyncdebug [CPG ] joinlist_messages[6] group:cib\x00, ip:r(0) ip(10.220.88.41) , pid:13373
>> [12880] cl15-02 corosyncdebug [CPG ] joinlist_messages[7] group:pacemakerd\x00, ip:r(0) ip(10.220.88.41) , pid:13372
>> [12880] cl15-02 corosyncdebug [CPG ] joinlist_messages[8] group:crmd\x00, ip:r(0) ip(10.220.88.47) , pid:12879
>> [12880] cl15-02 corosyncdebug [CPG ] joinlist_messages[9] group:attrd\x00, ip:r(0) ip(10.220.88.47) , pid:12877
>> [12880] cl15-02 corosyncdebug [CPG ] joinlist_messages[10] group:stonith-ng\x00, ip:r(0) ip(10.220.88.47) , pid:12875
>> [12880] cl15-02 corosyncdebug [CPG ] joinlist_messages[11] group:cib\x00, ip:r(0) ip(10.220.88.47) , pid:12874
>> [12880] cl15-02 corosyncdebug [CPG ] joinlist_messages[12] group:pacemakerd\x00, ip:r(0) ip(10.220.88.47) , pid:12873
>> [12880] cl15-02 corosyncdebug [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
>> [12880] cl15-02 corosyncdebug [VOTEQ ] got nodeinfo message from cluster node 1
>> [12880] cl15-02 corosyncdebug [VOTEQ ] nodeinfo message[1]: votes: 1, expected: 3 flags: 1
>> [12880] cl15-02 corosyncdebug [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
>> [12880] cl15-02 corosyncdebug [VOTEQ ] total_votes=2, expected_votes=3
>> [12880] cl15-02 corosyncdebug [VOTEQ ] node 1 state=1, votes=1, expected=3
>> [12880] cl15-02 corosyncdebug [VOTEQ ] node 2 state=1, votes=1, expected=3
>> [12880] cl15-02 corosyncdebug [VOTEQ ] node 3 state=2, votes=1, expected=3
>> [12880] cl15-02 corosyncdebug [VOTEQ ] lowest node id: 1 us: 1
>> [12880] cl15-02 corosyncdebug [VOTEQ ] highest node id: 2 us: 1
>> [12880] cl15-02 corosyncdebug [VOTEQ ] got nodeinfo message from cluster node 1
>> [12880] cl15-02 corosyncdebug [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
>> [12880] cl15-02 corosyncdebug [VOTEQ ] got nodeinfo message from cluster node 2
>> [12880] cl15-02 corosyncdebug [VOTEQ ] nodeinfo message[2]: votes: 1, expected: 3 flags: 1
>> [12880] cl15-02 corosyncdebug [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
>> [12880] cl15-02 corosyncdebug [VOTEQ ] got nodeinfo message from cluster node 2
>> [12880] cl15-02 corosyncdebug [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
>> [12880] cl15-02 corosyncdebug [SYNC ] Committing synchronization for corosync vote quorum service v1.0
>> [12880] cl15-02 corosyncdebug [VOTEQ ] total_votes=2, expected_votes=3
>> [12880] cl15-02 corosyncdebug [VOTEQ ] node 1 state=1, votes=1, expected=3
>> [12880] cl15-02 corosyncdebug [VOTEQ ] node 2 state=1, votes=1, expected=3
>> [12880] cl15-02 corosyncdebug [VOTEQ ] node 3 state=2, votes=1, expected=3
>> [12880] cl15-02 corosyncdebug [VOTEQ ] lowest node id: 1 us: 1
>> [12880] cl15-02 corosyncdebug [VOTEQ ] highest node id: 2 us: 1
>> [12880] cl15-02 corosyncnotice [QUORUM] Members[2]: 1 2
>> [12880] cl15-02 corosyncdebug [QUORUM] sending quorum notification to (nil), length = 56
>> [12880] cl15-02 corosyncnotice [MAIN ] Completed service synchronization, ready to provide service.
>> [12880] cl15-02 corosyncdebug [TOTEM ] waiting_trans_ack changed to 0
>> [12880] cl15-02 corosyncdebug [QUORUM] got quorate request on 0x7f5a907749a0
>> [12880] cl15-02 corosyncdebug [TOTEM ] entering GATHER state from 11(merge during join).
>>
>>
>> and we do not get them when there is only a single network interface in the systems.
>>
>> --------------------------------------------------------------------------------------
>> These are the network configurations on the three hosts:
>>
>> [root@cl15-02 ~]# ifconfig | grep inet
>>     inet 10.220.88.41 netmask 255.255.248.0 broadcast 10.220.95.255
>>     inet 10.220.246.50 netmask 255.255.255.0 broadcast 10.220.246.255
>>     inet 127.0.0.1 netmask 255.0.0.0
>>
>> [root@cl15-08 ~]# ifconfig | grep inet
>>     inet 10.220.88.47 netmask 255.255.248.0 broadcast 10.220.95.255
>>     inet 10.220.246.51 netmask 255.255.255.0 broadcast 10.220.246.255
>>     inet 127.0.0.1 netmask 255.0.0.0
>>
>> [root@cl15-09 ~]# ifconfig | grep inet
>>     inet 10.220.88.48 netmask 255.255.248.0 broadcast 10.220.95.255
>>     inet 10.220.246.59 netmask 255.255.255.0 broadcast 10.220.246.255
>>     inet 127.0.0.1 netmask 255.0.0.0
>>
>> -----------------------------------------------------------------------------------
>> corosync-quorumtool output:
>>
>> [root@cl15-02 ~]# corosync-quorumtool
>> Quorum information
>> ------------------
>> Date:             Mon Apr 11 15:46:26 2016
>> Quorum provider:  corosync_votequorum
>> Nodes:            3
>> Node ID:          1
>> Ring ID:          18952
>> Quorate:          Yes
>>
>> Votequorum information
>> ----------------------
>> Expected votes:   3
>> Highest expected: 3
>> Total votes:      3
>> Quorum:           2
>> Flags:            Quorate
>>
>> Membership information
>> ----------------------
>>     Nodeid      Votes Name
>>          1          1 cl15-02 (local)
>>          2          1 cl15-08
>>          3          1 cl15-09
>>
>> ---------------------------------------------------------------------------
>> /etc/corosync/corosync.conf:
>>
>> [root@cl15-02 ~]# cat /etc/corosync/corosync.conf
>> totem {
>>     version: 2
>>     secauth: off
>>     cluster_name: gfs_cluster
>>     transport: udpu
>> }
>>
>> nodelist {
>>     node {
>>         ring0_addr: cl15-02
>>         nodeid: 1
>>     }
>>
>>     node {
>>         ring0_addr: cl15-08
>>         nodeid: 2
>>     }
>>
>>     node {
>>         ring0_addr: cl15-09
>>         nodeid: 3
>>     }
>> }
>>
>> quorum {
>>     provider: corosync_votequorum
>> }
>>
>> logging {
>>     debug: on
>
>
> You have debug logging on. At a guess I would say that the config file
> with the other interface in it doesn't :)
>
> Chrissie
>
>
>>     to_logfile: yes
>>     logfile: /var/log/cluster/corosync.log
>>     to_syslog: yes
>> }
>>
>
> --
> Linux-cluster mailing list
> Linux-cluster@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/linux-cluster
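
The lock delay seen under strace in this thread can be measured more directly. The following is only a minimal Python sketch, not something posted by either author: it times how long a blocking, whole-file POSIX lock takes to acquire through fcntl. The helper name time_lock and the default scratch path are hypothetical; on a cluster you would pass the path of a test file on the GFS2 mount instead. sqlite3 takes byte-range locks in its own pattern, so treat the numbers as a rough indication rather than a reproduction of its behaviour.

```python
import fcntl
import os
import sys
import tempfile
import time


def time_lock(path: str) -> float:
    """Return the seconds spent acquiring an exclusive POSIX lock on path."""
    # Create the file if needed; the lock itself is what we are timing.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        start = time.monotonic()
        fcntl.lockf(fd, fcntl.LOCK_EX)  # blocks until the lock is granted
        elapsed = time.monotonic() - start
        fcntl.lockf(fd, fcntl.LOCK_UN)  # release so the next probe starts clean
        return elapsed
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Hypothetical default: a scratch file in the local temp directory.
    # Pass a path on the GFS2 filesystem as the first argument to probe that.
    default_path = os.path.join(tempfile.gettempdir(), "lock_probe.tmp")
    target = sys.argv[1] if len(sys.argv) > 1 else default_path
    for attempt in range(5):
        print(f"attempt {attempt}: {time_lock(target):.6f} s")
        time.sleep(1)
```

Running it once with a single network configured and once with both would show whether the multi-second stalls follow the network setup or the debug logging.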