Patrick,
I'm sure it's really a firewall/switch problem. Please make sure that port
and port - 1 are not blocked. For testing purposes, you can just disable
the firewall completely and see if corosync works or not.

Regards,
  Honza

Patrick Hemmer wrote:
> *From: *Steven Dake <sdake@xxxxxxxxxx>
> *Sent: * 2013-09-30 18:12:25 E
> *To: *Patrick Hemmer <corosync@xxxxxxxxxxxxxxx>
> *CC: *discuss@xxxxxxxxxxxx
> *Subject: *Re: Issue starting the CMAP service
>
>> On 09/30/2013 02:43 PM, Patrick Hemmer wrote:
>>> *From: *Steven Dake <sdake@xxxxxxxxxx>
>>> *Sent: * 2013-09-30 16:50:26 E
>>> *To: *Patrick Hemmer <corosync@xxxxxxxxxxxxxxx>
>>> *CC: *discuss@xxxxxxxxxxxx
>>> *Subject: *Re: Issue starting the CMAP service
>>>
>>>> On 09/30/2013 01:45 PM, Patrick Hemmer wrote:
>>>>> I'm running corosync 2.3.2 on Ubuntu precise. I'm playing with a 3
>>>>> node cluster, and whenever I try to start corosync on one of the
>>>>> nodes, it fails to start properly.
>>>>> I just do a simple start with `corosync -f`, and whenever I try to
>>>>> use any of the tools, they error:
>>>>>
>>>>> # corosync-cmapctl
>>>>> Failed to initialize the cmap API. Error CS_ERR_TRY_AGAIN
>>>>> # corosync-quorumtool
>>>>> Cannot initialize CMAP service
>>>>>
>>>>> If I wait long enough (about 9 minutes or 530 seconds), it does end
>>>>> up starting, and the tools work, but corosync-quorumtool shows the
>>>>> only member is itself.
>>>>>
>>>>> However if I start corosync with `strace -f corosync -f` the tools
>>>>> work fine immediately upon start (though it still doesn't show the
>>>>> other nodes). Smells like a race condition, but dunno where to begin.
>>>>>
>>>>
>>>> My guess is something is wrong with your network relating to
>>>> multicast. Try using udpu mode - it is very stable now and removes
>>>> multicast from the list of things that can go wrong.
>>>>
>>>
>>> I am using udpu, see the config :-)
>>>
>>
>> I assume you have the same config on all nodes?
>> If so, try using IP addresses for the ring id. Possibly a DNS
>> resolution problem?
>>
>> Other than that, I'm stumped
>
> Yes, exact same config on all nodes. All hosts are present in
> /etc/hosts. Also when I do a tcpdump on the other nodes, I see traffic
> on port 5405 coming from the node in question.
>
>>
>> Regards
>> -steve
>>
>>>> Regards
>>>> -steve
>>>>
>>>>>
>>>>> This is the output from `corosync -f` (this node is 10.20.0.212):
>>>>> notice [TOTEM ] Initializing transport (UDP/IP Unicast).
>>>>> notice [TOTEM ] Initializing transmit/receive security (NSS)
>>>>> crypto: none hash: none
>>>>> notice [TOTEM ] The network interface [10.20.0.212] is now up.
>>>>> notice [TOTEM ] adding new UDPU member {10.20.0.127}
>>>>> notice [TOTEM ] adding new UDPU member {10.20.0.212}
>>>>> notice [TOTEM ] adding new UDPU member {10.20.2.124}
>>>>> notice [TOTEM ] A new membership (10.20.0.212:1122820) was formed.
>>>>> Members joined: 2
>>>>> notice [TOTEM ] A new membership (10.20.0.127:1122824) was formed.
>>>>> Members joined: 1 3
>>>>> ### here is where it pauses for almost 9 minutes ###
>>>>> error [TOTEM ] FAILED TO RECEIVE
>>>>> notice [TOTEM ] A new membership (10.20.0.212:1122876) was formed.
>>>>> Members left: 1 3
>>>>> notice [TOTEM ] A new membership (10.20.0.212:1122936) was formed.
>>>>> Members
>>>>> notice [TOTEM ] A new membership (10.20.0.212:1123008) was formed.
>>>>> Members
>>>>> notice [TOTEM ] A new membership (10.20.0.212:1123064) was formed.
>>>>> Members
>>>>> notice [TOTEM ] A new membership (10.20.0.212:1123124) was formed.
>>>>> Members
>>>>> notice [TOTEM ] A new membership (10.20.0.212:1123180) was formed.
>>>>> Members
>>>>> notice [TOTEM ] A new membership (10.20.0.212:1123248) was formed.
>>>>> Members
>>>>> notice [TOTEM ] A new membership (10.20.0.127:1123256) was formed.
>>>>> Members joined: 1 3
>>>>>
>>>>> This is the config (created by `pcs` utility), it's exactly the
>>>>> same on all 3 nodes, and the other 2 nodes work fine:
>>>>> ----
>>>>> totem {
>>>>>     version: 2
>>>>>     secauth: off
>>>>>     cluster_name: hapi-server
>>>>>     transport: udpu
>>>>> }
>>>>>
>>>>> nodelist {
>>>>>     node {
>>>>>         ring0_addr: i-74eb9c2f
>>>>>         nodeid: 1
>>>>>     }
>>>>>     node {
>>>>>         ring0_addr: i-a3bf0df9
>>>>>         nodeid: 2
>>>>>     }
>>>>>     node {
>>>>>         ring0_addr: i-ebcfcbb0
>>>>>         nodeid: 3
>>>>>     }
>>>>> }
>>>>>
>>>>> quorum {
>>>>>     provider: corosync_votequorum
>>>>> }
>>>>>
>>>>> logging {
>>>>>     to_syslog: yes
>>>>> }
>>>>> ----
>>>>>
>>>>> -Patrick
>>>>>
>>>>> _______________________________________________
>>>>> discuss mailing list
>>>>> discuss@xxxxxxxxxxxx
>>>>> http://lists.corosync.org/mailman/listinfo/discuss
>>>>
>>>
>>
>
> Here's some additional info from the command line utils after waiting 9
> minutes for it to come up:
>
> # corosync-quorumtool
> Quorum information
> ------------------
> Date:             Mon Sep 30 22:16:24 2013
> Quorum provider:  corosync_votequorum
> Nodes:            1
> Node ID:          2
> Ring ID:          1124320
> Quorate:          No
>
> Votequorum information
> ----------------------
> Expected votes:   3
> Highest expected: 3
> Total votes:      1
> Quorum:           2 Activity blocked
> Flags:
>
> Membership information
> ----------------------
>     Nodeid      Votes Name
>          2          1 i-a3bf0df9 (local)
>
> # corosync-cmapctl |grep member
> runtime.totem.pg.mrp.srp.members.1.ip (str) = r(0) ip(10.20.0.127)
> runtime.totem.pg.mrp.srp.members.1.join_count (u32) = 15
> runtime.totem.pg.mrp.srp.members.1.status (str) = joined
> runtime.totem.pg.mrp.srp.members.2.config_version (u64) = 0
> runtime.totem.pg.mrp.srp.members.2.ip (str) = r(0) ip(10.20.0.212)
> runtime.totem.pg.mrp.srp.members.2.join_count (u32) = 1
> runtime.totem.pg.mrp.srp.members.2.status (str) = joined
> runtime.totem.pg.mrp.srp.members.3.ip (str) = r(0) ip(10.20.2.124)
> runtime.totem.pg.mrp.srp.members.3.join_count (u32) = 15
> runtime.totem.pg.mrp.srp.members.3.status (str) = joined
>
> -Patrick
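
P.S. To make the "port and port - 1" advice at the top of this message concrete: the quoted config does not set mcastport, so corosync should be on its default of 5405 (which matches the tcpdump observation), meaning UDP 5404 and 5405 both have to pass between all three nodes. Below is a rough Python sketch that only prints candidate iptables rules; it does not run them. The helper names are made up for illustration, the peer addresses are taken from the quoted log output, and the INPUT chain and append position are assumptions to adapt to your own firewall setup.

```python
# Sketch only: these helper names are hypothetical, not part of corosync.
DEFAULT_MCASTPORT = 5405  # corosync default when mcastport is not configured


def corosync_udp_ports(mcastport=DEFAULT_MCASTPORT):
    """Return the two UDP ports corosync needs: (mcastport - 1, mcastport)."""
    if not 2 <= mcastport <= 65535:
        raise ValueError("mcastport must be between 2 and 65535")
    return (mcastport - 1, mcastport)


def iptables_accept_rules(peers, mcastport=DEFAULT_MCASTPORT):
    """Build (but do not run) one iptables ACCEPT rule per peer for both ports."""
    low, high = corosync_udp_ports(mcastport)
    return [
        f"iptables -A INPUT -p udp -s {peer} --dport {low}:{high} -j ACCEPT"
        for peer in peers
    ]


if __name__ == "__main__":
    # Peers as seen from node 10.20.0.212 in the quoted log output.
    for rule in iptables_accept_rules(["10.20.0.127", "10.20.2.124"]):
        print(rule)
```

Run it on each node with that node's peers, compare the printed rules against what `iptables -L -n` currently shows, and remember that a switch or cloud-level filter between the nodes has to allow the same two ports.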