Re: Help with corosync and GFS2 on multi network setup

On 12/04/16 15:02, Stefano Panella wrote:
> Hi Christine,
> 
> Thanks for your input. I have checked, and the configuration with only one network also has debugging turned on (the corosync.conf files are the same).
> 
> These messages repeat every 1-2 seconds. The reason I think something is wrong is that operations on a sqlite3 db on the GFS2 filesystem are much slower when the secondary network is present as well (along with the extra logging).
> 

The messages are just debugging messages - they are not indicative of
any problem. If anything they show that everything is fine - with
corosync at least. They will slow things down a little though.
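
If the extra logging is a concern, here is a minimal sketch of the logging
section from the quoted corosync.conf with debugging turned off. Only the
debug value differs from the configuration quoted below; the other lines are
copied from it unchanged. This assumes the same file is used on every node,
so the change would need to be made on each of them.

```
logging {
    # Assumption: the only change from the quoted config is debug: on -> off
    debug: off
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
}
```

With debug off, the [TOTEM], [CPG] and [VOTEQ] lines logged at debug level
should no longer be written, while notice-level lines such as "A new
membership ... was formed" would still appear.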

Chrissie

> If I strace the sqlite3 command, it is stuck for a few seconds (very similar to the period at which the logging repeats) in an fcntl system call needed to lock the db file.
> ________________________________________
> From: linux-cluster-bounces@xxxxxxxxxx <linux-cluster-bounces@xxxxxxxxxx> on behalf of Christine Caulfield <ccaulfie@xxxxxxxxxx>
> Sent: Tuesday, April 12, 2016 2:28 PM
> To: linux-cluster@xxxxxxxxxx
> Subject: Re:  Help with corosync and GFS2 on multi network setup
> 
> On 12/04/16 13:45, Stefano Panella wrote:
>> Hi everybody,
>>
>> We have been using corosync directly to provide clustering for GFS2 on our CentOS 7.2 pools with only one network interface, and everything has been working great so far!
>>
>> We now have a new set-up with two network interfaces for every host in the cluster:
>> A -> 1 Gbit (the one we would like corosync to use, 10.220.88.X)
>> B -> 10 Gbit (used for the iSCSI connection to storage, 10.220.246.X)
>>
>> When we run corosync in this mode, the logs are continuously spammed by messages like these:
>>
>> [12880] cl15-02 corosyncdebug   [TOTEM ] entering GATHER state from 0(consensus timeout).
>> [12880] cl15-02 corosyncdebug   [TOTEM ] Creating commit token because I am the rep.
>> [12880] cl15-02 corosyncdebug   [TOTEM ] Saving state aru 10 high seq received 10
>> [12880] cl15-02 corosyncdebug   [MAIN  ] Storing new sequence id for ring 5750
>> [12880] cl15-02 corosyncdebug   [TOTEM ] entering COMMIT state.
>> [12880] cl15-02 corosyncdebug   [TOTEM ] got commit token
>> [12880] cl15-02 corosyncdebug   [TOTEM ] entering RECOVERY state.
>> [12880] cl15-02 corosyncdebug   [TOTEM ] TRANS [0] member 10.220.88.41:
>> [12880] cl15-02 corosyncdebug   [TOTEM ] TRANS [1] member 10.220.88.47:
>> [12880] cl15-02 corosyncdebug   [TOTEM ] position [0] member 10.220.88.41:
>> [12880] cl15-02 corosyncdebug   [TOTEM ] previous ring seq 574c rep 10.220.88.41
>> [12880] cl15-02 corosyncdebug   [TOTEM ] aru 10 high delivered 10 received flag 1
>> [12880] cl15-02 corosyncdebug   [TOTEM ] position [1] member 10.220.88.47:
>> [12880] cl15-02 corosyncdebug   [TOTEM ] previous ring seq 574c rep 10.220.88.41
>> [12880] cl15-02 corosyncdebug   [TOTEM ] aru 10 high delivered 10 received flag 1
>>
>> [12880] cl15-02 corosyncdebug   [TOTEM ] Did not need to originate any messages in recovery.
>> [12880] cl15-02 corosyncdebug   [TOTEM ] got commit token
>> [12880] cl15-02 corosyncdebug   [TOTEM ] Sending initial ORF token
>> [12880] cl15-02 corosyncdebug   [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 0, aru 0
>> [12880] cl15-02 corosyncdebug   [TOTEM ] install seq 0 aru 0 high seq received 0
>> [12880] cl15-02 corosyncdebug   [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 1, aru 0
>> [12880] cl15-02 corosyncdebug   [TOTEM ] install seq 0 aru 0 high seq received 0
>> [12880] cl15-02 corosyncdebug   [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 2, aru 0
>> [12880] cl15-02 corosyncdebug   [TOTEM ] install seq 0 aru 0 high seq received 0
>> [12880] cl15-02 corosyncdebug   [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 3, aru 0
>> [12880] cl15-02 corosyncdebug   [TOTEM ] install seq 0 aru 0 high seq received 0
>> [12880] cl15-02 corosyncdebug   [TOTEM ] retrans flag count 4 token aru 0 install seq 0 aru 0 0
>> [12880] cl15-02 corosyncdebug   [TOTEM ] Resetting old ring state
>> [12880] cl15-02 corosyncdebug   [TOTEM ] recovery to regular 1-0
>> [12880] cl15-02 corosyncdebug   [TOTEM ] waiting_trans_ack changed to 1
>> Apr 11 16:19:54 [13372] cl15-02 pacemakerd:     info: pcmk_quorum_notification: Membership 22352: quorum retained (2)
>> Apr 11 16:19:54 [13378] cl15-02       crmd:     info: pcmk_quorum_notification: Membership 22352: quorum retained (2)
>> [12880] cl15-02 corosyncdebug   [TOTEM ] entering OPERATIONAL state.
>> [12880] cl15-02 corosyncnotice  [TOTEM ] A new membership (10.220.88.41:22352) was formed. Members
>> [12880] cl15-02 corosyncdebug   [SYNC  ] Committing synchronization for corosync configuration map access
>> Apr 11 16:19:54 [13373] cl15-02        cib:     info: cib_process_request:      Forwarding cib_modify operation for section nodes to master (origin=local/crmd/27157)
>> [12880] cl15-02 corosyncdebug   [CMAP  ] Not first sync -> no action
>> Apr 11 16:19:54 [13373] cl15-02        cib:     info: cib_process_request:      Forwarding cib_modify operation for section status to master (origin=local/crmd/27158)
>> [12880] cl15-02 corosyncdebug   [CPG   ] got joinlist message from node 0x2
>> [12880] cl15-02 corosyncdebug   [CPG   ] comparing: sender r(0) ip(10.220.88.41) ; members(old:2 left:0)
>> [12880] cl15-02 corosyncdebug   [CPG   ] comparing: sender r(0) ip(10.220.88.47) ; members(old:2 left:0)
>> [12880] cl15-02 corosyncdebug   [CPG   ] chosen downlist: sender r(0) ip(10.220.88.41) ; members(old:2 left:0)
>> [12880] cl15-02 corosyncdebug   [CPG   ] got joinlist message from node 0x1
>> [12880] cl15-02 corosyncdebug   [SYNC  ] Committing synchronization for corosync cluster closed process group service v1.01
>> Apr 11 16:19:54 [13373] cl15-02        cib:     info: cib_process_request:      Completed cib_modify operation for section nodes: OK (rc=0, origin=cl15-02/crmd/27157, version=0.18.22)
>> [12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[0] group:clvmd, ip:r(0) ip(10.220.88.41) , pid:35677
>> Apr 11 16:19:54 [13373] cl15-02        cib:     info: cib_process_request:      Completed cib_modify operation for section status: OK (rc=0, origin=cl15-02/crmd/27158, version=0.18.22)
>> [12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[1] group:dlm:ls:clvmd\x00, ip:r(0) ip(10.220.88.41) , pid:34995
>> [12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[2] group:dlm:controld\x00, ip:r(0) ip(10.220.88.41) , pid:34995
>> [12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[3] group:crmd\x00, ip:r(0) ip(10.220.88.41) , pid:13378
>> [12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[4] group:attrd\x00, ip:r(0) ip(10.220.88.41) , pid:13376
>> [12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[5] group:stonith-ng\x00, ip:r(0) ip(10.220.88.41) , pid:13374
>> [12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[6] group:cib\x00, ip:r(0) ip(10.220.88.41) , pid:13373
>> [12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[7] group:pacemakerd\x00, ip:r(0) ip(10.220.88.41) , pid:13372
>> [12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[8] group:crmd\x00, ip:r(0) ip(10.220.88.47) , pid:12879
>> [12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[9] group:attrd\x00, ip:r(0) ip(10.220.88.47) , pid:12877
>> [12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[10] group:stonith-ng\x00, ip:r(0) ip(10.220.88.47) , pid:12875
>> [12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[11] group:cib\x00, ip:r(0) ip(10.220.88.47) , pid:12874
>> [12880] cl15-02 corosyncdebug   [CPG   ] joinlist_messages[12] group:pacemakerd\x00, ip:r(0) ip(10.220.88.47) , pid:12873
>> [12880] cl15-02 corosyncdebug   [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
>> [12880] cl15-02 corosyncdebug   [VOTEQ ] got nodeinfo message from cluster node 1
>> [12880] cl15-02 corosyncdebug   [VOTEQ ] nodeinfo message[1]: votes: 1, expected: 3 flags: 1
>> [12880] cl15-02 corosyncdebug   [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
>> [12880] cl15-02 corosyncdebug   [VOTEQ ] total_votes=2, expected_votes=3
>> [12880] cl15-02 corosyncdebug   [VOTEQ ] node 1 state=1, votes=1, expected=3
>> [12880] cl15-02 corosyncdebug   [VOTEQ ] node 2 state=1, votes=1, expected=3
>> [12880] cl15-02 corosyncdebug   [VOTEQ ] node 3 state=2, votes=1, expected=3
>> [12880] cl15-02 corosyncdebug   [VOTEQ ] lowest node id: 1 us: 1
>> [12880] cl15-02 corosyncdebug   [VOTEQ ] highest node id: 2 us: 1
>> [12880] cl15-02 corosyncdebug   [VOTEQ ] got nodeinfo message from cluster node 1
>> [12880] cl15-02 corosyncdebug   [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
>> [12880] cl15-02 corosyncdebug   [VOTEQ ] got nodeinfo message from cluster node 2
>> [12880] cl15-02 corosyncdebug   [VOTEQ ] nodeinfo message[2]: votes: 1, expected: 3 flags: 1
>> [12880] cl15-02 corosyncdebug   [VOTEQ ] flags: quorate: Yes Leaving: No WFA Status: No First: No Qdevice: No QdeviceAlive: No QdeviceCastVote: No QdeviceMasterWins: No
>> [12880] cl15-02 corosyncdebug   [VOTEQ ] got nodeinfo message from cluster node 2
>> [12880] cl15-02 corosyncdebug   [VOTEQ ] nodeinfo message[0]: votes: 0, expected: 0 flags: 0
>> [12880] cl15-02 corosyncdebug   [SYNC  ] Committing synchronization for corosync vote quorum service v1.0
>> [12880] cl15-02 corosyncdebug   [VOTEQ ] total_votes=2, expected_votes=3
>> [12880] cl15-02 corosyncdebug   [VOTEQ ] node 1 state=1, votes=1, expected=3
>> [12880] cl15-02 corosyncdebug   [VOTEQ ] node 2 state=1, votes=1, expected=3
>> [12880] cl15-02 corosyncdebug   [VOTEQ ] node 3 state=2, votes=1, expected=3
>> [12880] cl15-02 corosyncdebug   [VOTEQ ] lowest node id: 1 us: 1
>> [12880] cl15-02 corosyncdebug   [VOTEQ ] highest node id: 2 us: 1
>> [12880] cl15-02 corosyncnotice  [QUORUM] Members[2]: 1 2
>> [12880] cl15-02 corosyncdebug   [QUORUM] sending quorum notification to (nil), length = 56
>> [12880] cl15-02 corosyncnotice  [MAIN  ] Completed service synchronization, ready to provide service.
>> [12880] cl15-02 corosyncdebug   [TOTEM ] waiting_trans_ack changed to 0
>> [12880] cl15-02 corosyncdebug   [QUORUM] got quorate request on 0x7f5a907749a0
>> [12880] cl15-02 corosyncdebug   [TOTEM ] entering GATHER state from 11(merge during join).
>>
>>
>> We do not get these messages when there is only a single network interface in the systems.
>>
>> --------------------------------------------------------------------------------------
>> These are the network configurations on the three hosts:
>>
>> [root@cl15-02 ~]# ifconfig | grep inet
>>         inet 10.220.88.41  netmask 255.255.248.0  broadcast 10.220.95.255
>>         inet 10.220.246.50  netmask 255.255.255.0  broadcast 10.220.246.255
>>         inet 127.0.0.1  netmask 255.0.0.0
>>
>> [root@cl15-08 ~]# ifconfig | grep inet
>>         inet 10.220.88.47  netmask 255.255.248.0  broadcast 10.220.95.255
>>         inet 10.220.246.51  netmask 255.255.255.0  broadcast 10.220.246.255
>>         inet 127.0.0.1  netmask 255.0.0.0
>>
>> [root@cl15-09 ~]# ifconfig | grep inet
>>         inet 10.220.88.48  netmask 255.255.248.0  broadcast 10.220.95.255
>>         inet 10.220.246.59  netmask 255.255.255.0  broadcast 10.220.246.255
>>         inet 127.0.0.1  netmask 255.0.0.0
>>
>> -----------------------------------------------------------------------------------
>> corosync-quorumtool output:
>>
>> [root@cl15-02 ~]# corosync-quorumtool
>> Quorum information
>> ------------------
>> Date:             Mon Apr 11 15:46:26 2016
>> Quorum provider:  corosync_votequorum
>> Nodes:            3
>> Node ID:          1
>> Ring ID:          18952
>> Quorate:          Yes
>>
>> Votequorum information
>> ----------------------
>> Expected votes:   3
>> Highest expected: 3
>> Total votes:      3
>> Quorum:           2
>> Flags:            Quorate
>>
>> Membership information
>> ----------------------
>>     Nodeid      Votes Name
>>          1          1 cl15-02 (local)
>>          2          1 cl15-08
>>          3          1 cl15-09
>>
>> ---------------------------------------------------------------------------
>> /etc/corosync/corosync.conf:
>>
>> [root@cl15-02 ~]# cat /etc/corosync/corosync.conf
>> totem {
>>     version: 2
>>     secauth: off
>>     cluster_name: gfs_cluster
>>     transport: udpu
>> }
>>
>> nodelist {
>>     node {
>>         ring0_addr: cl15-02
>>         nodeid: 1
>>     }
>>
>>     node {
>>         ring0_addr: cl15-08
>>         nodeid: 2
>>     }
>>
>>     node {
>>         ring0_addr: cl15-09
>>         nodeid: 3
>>     }
>> }
>>
>> quorum {
>>     provider: corosync_votequorum
>> }
>>
>> logging {
>>     debug: on
> 
> 
> You have debug logging on. At a guess I would say that the config file
> with the other interface in it doesn't :)
> 
> Chrissie
> 
> 
>>     to_logfile: yes
>>     logfile: /var/log/cluster/corosync.log
>>     to_syslog: yes
>> }
>>
> 
> --
> Linux-cluster mailing list
> Linux-cluster@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 



