On 11/08/14 09:36, Jan Friesse wrote:
Christine Caulfield wrote:
On 11/08/14 09:17, Jan Friesse wrote:
Christine Caulfield wrote:
On 11/08/14 08:59, Jan Friesse wrote:
Chrissie,
On 11/08/14 07:29, Jan Friesse wrote:
Chrissie,
patch looks generally good, but is there a reason to add a new library
call instead of tracking "quorum.wait_for_all" and, if it is set to 0,
executing code very similar to
message_handler_req_lib_votequorum_cancel_wait_for_all?
Yes. The point is not to clear wait_for_all itself; that's a
configuration option and we are not changing it - just the runtime wait
state. The config option needs to remain enabled for the next time nodes
go down.
Yes. But the user can call corosync-cmapctl to change this variable. We
don't need (or want) to change it via a reload. A very similar thing
happens with expected votes. Take a look at
ed63c812afc15fc68ebd3363845a63f5c945623e (this was actually the
inspiration for what I'm suggesting). wait_for_all is exactly the same.
Allow natural selection. A dynamic change of "config" (but not stored to
the config file).
No, it's not changing the config - even dynamically. It's changing a
state inside corosync, not even a dynamic configuration parameter.
Sure
wait_for_all_status is NOT the same thing as quorum.wait_for_all - not
even slightly. wait_for_all needs to remain set after this command
(whatever it turns out to be) for the next time a node goes down; we do
not want to have to wait for a reload for that to happen.
It will. cmap is NOT stored back into the config, so wait_for_all WILL
remain set after this command for the next time a node goes down (you
are talking about the local node, right?). No reload needs to happen.
Indeed, but it's still stored in cmap - and I don't want wait_for_all to
change in cmap. If that was what I wanted then that's what I would have
done.
wait_for_all is not what I'm changing here. It's the internal status
that says we are currently waiting - not that we should wait in future.
Oh. So you are talking about config integrity (the current cmap
reflecting what is in the config) and not about a "technical" problem.
It makes sense then.
But I think changing runtime.votequorum.wait_for_all_status would give a
big problem with recursion (maybe quite hard to solve, especially on a
non-local node, relative to the node which initiated the change). If so,
you could try a method similar to triggering blackbox creation (set
something like runtime.votequorum.cancel_wait_for_all).
Honza
OK, try 3...
This one just uses a cmap key to trigger the cancel. Although it looks a
lot neater than the last one (mainly due to the lack of the new API
call), I'm not totally happy about using a cmap variable as an
edge-trigger. Still, it's a small patch and it keeps the thing
undocumented - which is probably a plus ;-)
Chrissie
diff --git a/exec/votequorum.c b/exec/votequorum.c
index 78e6b7b..6caccaf 100644
--- a/exec/votequorum.c
+++ b/exec/votequorum.c
@@ -150,6 +150,7 @@ static int votequorum_exec_send_quorum_notification(void *conn, uint64_t context
#define VOTEQUORUM_RECONFIG_PARAM_EXPECTED_VOTES 1
#define VOTEQUORUM_RECONFIG_PARAM_NODE_VOTES 2
+#define VOTEQUORUM_RECONFIG_PARAM_CANCEL_WFA 3
static int votequorum_exec_send_reconfigure(uint8_t param, unsigned int nodeid, uint32_t value);
@@ -1487,6 +1488,7 @@ static void votequorum_refresh_config(
{
int old_votes, old_expected_votes;
uint8_t reloading;
+ uint8_t cancel_wfa = 0;
ENTER();
@@ -1498,6 +1500,15 @@ static void votequorum_refresh_config(
return ;
}
+ icmap_get_uint8("quorum.cancel_wait_for_all", &cancel_wfa);
+ if (strcmp(key_name, "quorum.cancel_wait_for_all") == 0 &&
+ cancel_wfa >= 1) {
+ icmap_set_uint8("quorum.cancel_wait_for_all", 0);
+ votequorum_exec_send_reconfigure(VOTEQUORUM_RECONFIG_PARAM_CANCEL_WFA,
+ us->node_id, 0);
+ return;
+ }
+
old_votes = us->votes;
old_expected_votes = us->expected_votes;
@@ -2070,6 +2081,14 @@ static void message_handler_req_exec_votequorum_reconfigure (
recalculate_quorum(1, 0); /* Allow decrease */
break;
+ case VOTEQUORUM_RECONFIG_PARAM_CANCEL_WFA:
+ update_wait_for_all_status(0);
+ log_printf(LOGSYS_LEVEL_INFO, "wait_for_all_status reset by user on node %d.",
+ req_exec_quorum_reconfigure->nodeid);
+ recalculate_quorum(0, 0);
+
+ break;
+
}
LEAVE();
_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss