Hi Lon,

On Fri, 2007-01-12 at 16:05 -0500, Lon Hohberger wrote:
> On Fri, 2007-01-12 at 14:59 +0100, Simone Gotti wrote:
> > Hi all,
> >
> > On a 2-node openais cman cluster, I failed a network interface and
> > noticed that it didn't fail over to the other node.
> >
> > Looking at the rgmanager-2.0.16 code I noticed that:
> >
> > handle_relocate_req is called with preferred_target = -1, but inside
> > this function there are 2 checks to see if the preferred_target is
> > set. The check is an 'if (preferred_target != 0)', so the function
> > thinks that a preferred target has been chosen. Then, inside the loop,
> > the only target that really exists is "me" (as -1 isn't a real target)
> > and there is a "goto exausted:", so the service is restarted only on
> > the local node, where it fails again and is then stopped. Changing
> > these checks to "> 0" worked.
> >
> > Before writing a patch I noticed that the RHEL4 CVS tag uses
> > NODE_ID_NONE instead of the numeric values, so the problem (not
> > tested) probably doesn't happen there.
> > Is it perhaps a forgotten patch on HEAD and RHEL5?
>
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=222485
>
> Please attach your patch if you have it; I wrote one, but yours is
> already tested :) (or you can send it here, too)

My patch is the same as yours, but you also modified the type of the
"target" and "me" variables :D

I have a small doubt: I used (as you said) a check with ">= 0", but it
could probably also be "> 0", since in cmanccs.c:read_ccs_nodes I see:

[...]
	if (check_nodeids && nodeid == 0) {
		char message[132];

		sprintf(message, "No node ID for %s, run 'ccs_tool addnodeids' to fix",
			nodename);
		log_msg(LOG_ERR, message);
		write_cman_pipe(message);
		return -1;
	}
[...]

So it looks like a nodeid must be > 0 (as 0 does not seem to be
accepted).

What do you think?
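
To make the difference between the three checks concrete, here is a
small standalone sketch. It is not rgmanager code: the helper names and
the NO_PREFERRED_TARGET macro are made up for illustration, and the only
things taken from the thread are the value -1 and the three comparisons.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical name; -1 is the value handle_relocate_req is called with. */
#define NO_PREFERRED_TARGET (-1)

/* Original check: any non-zero value, including -1, counts as a target. */
static bool has_target_orig(int preferred_target)
{
	return preferred_target != 0;
}

/* Check used in the attached patch. */
static bool has_target_ge0(int preferred_target)
{
	return preferred_target >= 0;
}

/* Stricter variant; only equivalent if node ID 0 is never assigned. */
static bool has_target_gt0(int preferred_target)
{
	return preferred_target > 0;
}

/* Self-check of the behavior discussed in the thread. */
static void check_target_predicates(void)
{
	/* The bug: -1 is mistaken for a real preferred target. */
	assert(has_target_orig(NO_PREFERRED_TARGET));

	/* Both candidate fixes reject -1. */
	assert(!has_target_ge0(NO_PREFERRED_TARGET));
	assert(!has_target_gt0(NO_PREFERRED_TARGET));

	/* The two fixes disagree only for node ID 0. */
	assert(has_target_ge0(0) && !has_target_gt0(0));
	assert(has_target_ge0(2) && has_target_gt0(2));
}
```

Calling check_target_predicates() from a test program exercises all
three variants. If node ID 0 really is rejected at configuration time,
as the read_ccs_nodes snippet suggests, then ">= 0" and "> 0" behave the
same for every value that can occur in practice.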
>
> > The same problem is in the ip.sh resource scripts as it's missing the
> > patch for "Fix bug in ip.sh allowing start of the IP if the link was
> > down, preventing failover (linux-cluster reported)." in 1.5.2.16 of
> > RHEL4 branch.
>
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=222484
>
> This one's already got a fix as you said in the RHEL4 branch; we'll use
> it.

Thanks!

Bye!

> -- Lon

--
Simone Gotti
Index: src/daemons/rg_state.c
===================================================================
RCS file: /cvs/cluster/cluster/rgmanager/src/daemons/rg_state.c,v
retrieving revision 1.26
diff -u -b -B -p -r1.26 rg_state.c
--- src/daemons/rg_state.c	14 Dec 2006 22:18:07 -0000	1.26
+++ src/daemons/rg_state.c	14 Jan 2007 16:51:32 -0000
@@ -1308,7 +1308,7 @@ handle_relocate_req(char *svcName, int r
 		return RG_EFORWARD;
 	}
 
-	if (preferred_target != 0) {
+	if (preferred_target >= 0) {
 		allowed_nodes = member_list();
 
 		/*
@@ -1380,7 +1380,7 @@ handle_relocate_req(char *svcName, int r
 		//count_resource_groups(allowed_nodes);
 	}
 
-	if (preferred_target != 0)
+	if (preferred_target >= 0)
 		memb_mark_down(allowed_nodes, preferred_target);
 	memb_mark_down(allowed_nodes, me);
 
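
For anyone who wants to see the effect of the patch without a cluster,
the following toy model may help. It is a rough sketch based only on the
description in this thread, not on the real handle_relocate_req: the
struct, the EXHAUSTED value and pick_target() are all hypothetical, and
the real function does much more (failover domains, actual start
attempts, and so on).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical return value: nobody except "me" can take the service. */
#define EXHAUSTED (-2)

/* Hypothetical, simplified view of a cluster member. */
struct member {
	int  nodeid;
	bool up;
};

/*
 * Pick a relocation target other than "me".
 * patched == false uses the original check (!= 0),
 * patched == true uses the check from the patch (>= 0).
 */
static int pick_target(const struct member *members, size_t n, int me,
		       int preferred_target, bool patched)
{
	bool has_preferred = patched ? (preferred_target >= 0)
				     : (preferred_target != 0);

	for (size_t i = 0; i < n; i++) {
		int id = members[i].nodeid;

		if (!members[i].up || id == me)
			continue;
		/* With a preferred target set, every other node is ignored. */
		if (has_preferred && id != preferred_target)
			continue;
		return id;
	}
	return EXHAUSTED;
}

/* Two-node cluster, service fails on node 1, no preferred target (-1). */
static void check_pick_target(void)
{
	const struct member cluster[] = { { 1, true }, { 2, true } };
	const int me = 1;

	/* Original check: -1 filters out node 2, so only "me" is left. */
	assert(pick_target(cluster, 2, me, -1, false) == EXHAUSTED);

	/* Patched check: -1 means "no preference", node 2 is chosen. */
	assert(pick_target(cluster, 2, me, -1, true) == 2);

	/* An explicit preferred target works the same either way. */
	assert(pick_target(cluster, 2, me, 2, false) == 2);
	assert(pick_target(cluster, 2, me, 2, true) == 2);
}
```

In this model the unpatched check reproduces the symptom reported above:
with preferred_target = -1 the other node is never considered, so the
service can only be restarted locally.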
-- Linux-cluster mailing list Linux-cluster@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/linux-cluster