> Hi,
>
> On Wed, Jul 25, 2012 at 7:57 PM, Legault Mélanie
> <melanie.legault.2@xxxxxxxxxxxx> wrote:
>
>> Hello,
>>
>> I have a 3-node cluster; I just migrated one node from Fedora 13 to RHEL 6.2.
>>
>> I copied the /etc/corosync/corosync.conf file to the upgraded server and
>> started corosync, but it seems the RHEL server is not able to join the
>> existing cluster.
>>
>> Status on the other 2 nodes shows:
>>
>> Last updated: Wed Jul 25 12:19:58 2012
>> Stack: openais
>> Current DC: node2 - partition with quorum
>> Version: 1.1.4-ac608e3491c7dfc3b3e3c36d966ae9b016f77065
>> 3 Nodes configured, 3 expected votes
>> 4 Resources configured.
>> ============
>>
>> Online: [ node1 node2 ]
>> OFFLINE: [ node3 ]
>>
>> On node3:
>>
>> ============
>> Last updated: Wed Jul 25 12:28:34 2012
>> Last change: Wed Jul 25 11:26:06 2012 via crmd on node3
>> Stack: openais
>> Current DC: NONE
>> 1 Nodes configured, 2 expected votes
>> 0 Resources configured.
>> ============
>>
>> Node node3: UNCLEAN (offline)
>>
>> and then after a few minutes it changes to
>>
>> Online: [ node3 ]
>>
>> Here is the /etc/corosync/corosync.conf file used on all 3 servers:
>>
>> compatibility: whitetank
>>
>> totem {
>>     token: 5000
>>     token_retransmits_before_loss_const: 20
>>     join: 1000
>>     consensus: 7500
>>     vfstype: none
>>     version: 2
>>     secauth: off
>>     threads: 0
>>     interface {
>>         ringnumber: 0
>>         bindnetaddr: 10.11.12.0
>>         mcastaddr: 239.255.0.0
>>         mcastport: 5555
>>     }
>> }
>>
>> logging {
>>     fileline: off
>>     to_stderr: no
>>     to_logfile: yes
>>     to_syslog: no
>>     syslog_facility: daemon
>>     logfile: /var/log/cluster/corosync.log
>>     debug: off
>>     timestamp: on
>>     #logger_subsys {
>>     #    subsys: AMF
>>     #    debug: off
>>     #}
>> }
>>
>> amf {
>>     mode: disabled
>> }
>>
>> I tried to import the CIB file saved by a working node into node3, but I got an error:
>>
>> Signon to CIB failed: connection failed
>> Init failed, could not perform requested operations
>> ERROR: cannot parse xml: no element found: line 1, column 0
>> ERROR: No CIB!
>>
>> If I run corosync-objctl on node3, I get the following:
>>
>> ...
>> runtime.totem.pg.mrp.srp.members.274761738.ip=r(0) ip(10.11.12.11)
>> runtime.totem.pg.mrp.srp.members.274761738.join_count=1
>> runtime.totem.pg.mrp.srp.members.274761738.status=joined
>> runtime.totem.pg.mrp.srp.members.174098442.ip=r(0) ip(10.11.12.12)
>> runtime.totem.pg.mrp.srp.members.174098442.join_count=1
>> runtime.totem.pg.mrp.srp.members.174098442.status=joined
>> runtime.totem.pg.mrp.srp.members.190875658.ip=r(0) ip(10.11.12.13)
>> runtime.totem.pg.mrp.srp.members.190875658.join_count=1
>> runtime.totem.pg.mrp.srp.members.190875658.status=joined
>> ...
>>
>> as if node3 can see the other nodes.
>
> What versions of corosync and pacemaker do you have on node 3? What
> version of corosync is on nodes 1 and 2?
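For reference, a minimal sketch of how the details asked for here, and the state shown above, can be cross-checked on an RPM-based node like the ones in this thread. The cib-backup.xml file name is only an illustration, and the cibadmin step only works once the local cib process is reachable, which the "Signon to CIB failed" error above indicates it was not at the time.

  # Installed stack versions on each node
  rpm -q corosync pacemaker

  # Totem membership as corosync currently sees it
  corosync-objctl | grep members

  # One way to carry the CIB over from a healthy node
  cibadmin --query > cib-backup.xml               # run on node1 or node2
  cibadmin --replace --xml-file cib-backup.xml    # run on node3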
Nodes 1 & 2: corosync 1.3.1-1.fc13, pacemaker 1.1.4-5.fc13
Node 3:      corosync 1.4.1-7.el6, pacemaker 1.1.7-6.el6

>> I do have the following in the log files:
>>
>> Jul 25 12:43:37 [8203] node3 pengine: error: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
>> Jul 25 12:43:37 [8203] node3 pengine: error: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
>> Jul 25 12:43:37 [8203] node3 pengine: error: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
>
> This is a normal message: it means the node has no STONITH configured, which
> happens because it can't see the rest of the cluster (if it could, it would
> also have received the cluster configuration from the other nodes).

How come node 3 can't see nodes 1 & 2 if it can see them in the corosync-objctl output shown above?

>> Could you provide me with a hint of what to do? The firewall is not the
>> cause (I tested with it disabled entirely). Are Fedora and RHEL RPM-based
>> packages incompatible?
>
> Between Fedora 13 and RHEL 6 it may or may not work, so the answer is: it
> depends. The best approach for a rolling upgrade would be to put the cluster
> in maintenance-mode, upgrade all the software, make sure it works, refresh,
> reprobe, and if all is OK, take the cluster out of maintenance-mode.
>
> HTH,
> Dan
>
>> Thanks,
>> Mélanie
>
> --
> Dan Frincu
> CCNA, RHCE

Thanks,
Mélanie
_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss
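For completeness, a command-level sketch of the rolling-upgrade sequence Dan outlines above, assuming the crm shell is available on the nodes and that yum is the package manager; the package list in the update line is illustrative rather than taken from the thread. maintenance-mode leaves the resources running but unmanaged, so the software can be replaced underneath them without triggering failover.

  # Stop managing resources (they keep running where they are)
  crm configure property maintenance-mode=true

  # On each node in turn: upgrade the cluster stack
  yum update corosync pacemaker        # package names are an assumption

  # Once every node is upgraded and back online, re-read resource state
  crm resource refresh
  crm resource reprobe

  # If everything looks healthy, hand control back to the cluster
  crm configure property maintenance-mode=false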