Re: [PATCH 4/6] votequorum: Block sync until qdevice poll

Jason,

Hi Honza,

Sorry, my qdisk is not implemented yet, so I currently cannot
provide any text for the commit message. Attached is the top-level
pseudocode of my qdisk. I think it largely expresses my use case and how
it adapts to your new change in votequorum. Please take a look. Thanks!

I'm missing a dispatch call (to process events), but it generally looks good.

Honza


B.R.,
Jason

On Fri, Aug 8, 2014 at 2:32 PM, jason <huzhijiang@xxxxxxxxx> wrote:
Hi Honza,

On Aug 7, 2014 2:55 PM, "Jan Friesse" <jfriesse@xxxxxxxxxx> wrote:

Jason



Hi Honza,

I think I understand this patch after your explanation, thanks!

But if there is a very simple qdevice implementation whose voting is
not based on changed membership, but on the current voting
result, can it simply ignore the feature of this patch by setting


I'm not exactly sure what such a qdevice is then good for...


timeout to 0 to keep the old behavior?  I'm afraid it can't any


That's why I wrote *almost* the same behavior


more. Because if the timeout is set to 0, qdevice_timer_fn() will be called
immediately, which results in us->flags being erased, and thus votequorum
loses the current voting result.  Right?


Right.

But IF you want to use the old result for some reason, you can keep the default
config as it is and in your qdevice just call poll with the new ring id and the
old result right after receiving the votequorum change. The wait there should be
almost minimal (in my test with testvotequorum2, it's really a few
milliseconds).
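
The approach described above (poll right away with the new ring id and the previous result) can be sketched as a small self-contained model. This is not the corosync API: struct ring_id, struct vq_side, struct qdev and all functions below are hypothetical names made up for illustration, and the rule that a poll carrying a stale ring id does not end the wait is an assumption drawn from the wording of this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ring id for this sketch (not the corosync type). */
struct ring_id {
    uint32_t nodeid;
    uint64_t seq;
};

/* Hypothetical stand-in for the votequorum side of the exchange. */
struct vq_side {
    struct ring_id current_ring;
    bool waiting_for_poll;   /* sync is blocked while this is true */
    bool qdevice_vote;       /* last vote accepted from the qdevice */
};

/* Hypothetical qdevice state: it only remembers its last decision. */
struct qdev {
    bool last_vote;
};

bool ring_id_equal(const struct ring_id *a, const struct ring_id *b)
{
    return a->nodeid == b->nodeid && a->seq == b->seq;
}

/* Model: a new ring has formed, so sync starts waiting for the qdevice. */
void vq_membership_change(struct vq_side *vq, struct ring_id new_ring)
{
    assert(vq != NULL);
    vq->current_ring = new_ring;
    vq->waiting_for_poll = true;
}

/*
 * Model of a poll. Assumption: only a poll that carries the current
 * ring id is accepted and ends the wait. Returns true if accepted.
 */
bool vq_poll(struct vq_side *vq, struct ring_id ring, bool cast_vote)
{
    assert(vq != NULL);
    if (!ring_id_equal(&vq->current_ring, &ring)) {
        return false;
    }
    vq->qdevice_vote = cast_vote;
    vq->waiting_for_poll = false;
    return true;
}

/*
 * qdevice notification handler: poll at once with the new ring id and
 * the old result, so that sync waits only for this round trip.
 */
bool qdev_on_membership_change(const struct qdev *q, struct vq_side *vq,
                               struct ring_id new_ring)
{
    assert(q != NULL);
    return vq_poll(vq, new_ring, q->last_vote);
}
```

In this model a qdevice that calls qdev_on_membership_change() from its notification callback unblocks sync immediately, while a poll that still carries the previous ring id leaves sync waiting.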

I agree with you. This should be the way my use case adapts to this
patch to avoid the waiting in the sync process. So for my case, there is no
need for you to make a patch that reverts the new functionality when
sync_timeout is set to 0.


Anyway, I can make a patch which reverts the new functionality completely if
sync_timeout is set to 0 (that's easy). But can you please explain your use
case to me a little more (at least to have something for the commit message)?

My use case is a qdevice running in master_win mode. Normally, all qdevices
poll votequorum in a constant time cycle (for example, 1 sec), but
only the one master casts a vote. So after a network partition or a node
failure occurs, votequorum can immediately find the primary component by seeing
which component includes the master. So now I mainly wonder what and how a
qdevice can benefit from deciding to vote or not vote based on changed
membership, at the cost of making the sync process wait for the decision.
Although it looks like my use case is not a well-implemented qdevice, at
least it can totally avoid the waiting in the sync process.
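
As a worked example of the master_win idea described above, the following sketch computes which side of a split stays quorate when the qdevice vote follows the master. It rests on assumptions not stated in the thread: every node has one vote, and quorum is a simple majority of the expected votes. All function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: quorum is a simple majority of the expected votes. */
unsigned int quorum_for(unsigned int expected_votes)
{
    return expected_votes / 2 + 1;
}

/*
 * Votes held by one partition: one vote per node (assumption), plus the
 * qdevice votes if the master node is a member of this partition.
 */
unsigned int partition_votes(const unsigned int *nodes, size_t n_nodes,
                             unsigned int master_nodeid,
                             unsigned int qdevice_votes)
{
    unsigned int votes = (unsigned int)n_nodes;

    assert(nodes != NULL || n_nodes == 0);
    for (size_t i = 0; i < n_nodes; i++) {
        if (nodes[i] == master_nodeid) {
            votes += qdevice_votes;
            break;
        }
    }
    return votes;
}

/* A partition is the primary component if it reaches quorum. */
bool partition_is_quorate(const unsigned int *nodes, size_t n_nodes,
                          unsigned int master_nodeid,
                          unsigned int qdevice_votes,
                          unsigned int expected_votes)
{
    return partition_votes(nodes, n_nodes, master_nodeid, qdevice_votes)
           >= quorum_for(expected_votes);
}
```

With four nodes and a one-vote qdevice, the expected votes are 5 and quorum is 3. If the cluster splits into {1, 2} and {3, 4} and node 1 is the master, the first partition holds 3 votes and stays quorate, while the second holds 2 and does not.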





Honza



On Wed, Aug 6, 2014 at 5:58 PM, Jan Friesse <jfriesse@xxxxxxxxxx> wrote:

Jason,

Hi Honza,
Could you please explain this in more detail? Why should corosync block
until it gets a poll from the qdevice (or a timeout)? Why not simply use the
information provided by the latest poll? For example, in master win
mode,


Because the qdevice may decide not to vote based on changed membership. I
will give you an example:

- Qdevice votes
- membership change
- sync
- sync completed
- Qdevice decides to not vote

Now, between the membership change and the qdevice decision there may be quite
a long time during which votequorum uses incorrect information. The patch
simply prevents this race condition.
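
The waiting logic that closes this race can be modelled in a few lines. This is a simplified sketch, not the code from the patch: the flag name sync_wait_for_poll_or_timeout is taken from the patch, while struct vq_model and the model_* functions are hypothetical names, and clearing the alive state on timeout is an assumption based on the discussion of qdevice_timer_fn() in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified model of the waiting logic; not the real votequorum state. */
struct vq_model {
    bool qdevice_registered;
    bool qdevice_alive;
    bool sync_wait_for_poll_or_timeout; /* flag name taken from the patch */
};

/* Membership changed: wait only if a live qdevice is registered. */
void model_sync_init(struct vq_model *m)
{
    assert(m != NULL);
    m->sync_wait_for_poll_or_timeout =
        m->qdevice_registered && m->qdevice_alive;
}

/* Returns -1 while sync must keep waiting, 0 once it may complete. */
int model_sync_process(const struct vq_model *m)
{
    assert(m != NULL);
    if (m->qdevice_registered && m->sync_wait_for_poll_or_timeout) {
        return -1;
    }
    return 0;
}

/* A valid qdevice poll ends the wait. */
void model_qdevice_poll(struct vq_model *m)
{
    assert(m != NULL);
    m->sync_wait_for_poll_or_timeout = false;
}

/*
 * The poll timer expired. Assumption: the qdevice is then treated as
 * not alive, and the wait ends so that sync can complete.
 */
void model_qdevice_timeout(struct vq_model *m)
{
    assert(m != NULL);
    m->qdevice_alive = false;
    m->sync_wait_for_poll_or_timeout = false;
}
```

Sync therefore cannot complete between the membership change and the qdevice decision: model_sync_process() keeps returning -1 until either model_qdevice_poll() or model_qdevice_timeout() has run.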

information about who is master is already on hand when a network split
happens. So it can be used immediately to choose the primary component
without any delay.  But the blocking method in this patch will
introduce


It doesn't have much to do with the master...

a delay in deciding the primary component, which may cause the upper-level
application to stop providing service for a longer time, doesn't it?


Again, it has nothing to do with the primary component. But yes, the patch can
cause the upper-level application to stop providing service for quite a long
time (max 30 sec by default). That's why the sync timeout is configurable
(you can set it to 0 and you have almost the old behavior) and why a qdevice
must be developed with this problem in mind. Meaning, a qdevice must be as
fast as possible.

Did that answer your questions?

Regards,
    Honza

On Aug 5, 2014 6:10 PM, "Jan Friesse" <jfriesse@xxxxxxxxxx> wrote:

If qdevice is registered and alive, corosync waits in sync phase until
timeout expires or qdevice votes with correct nodeid parameter.

This gives qdevice time to decide to vote or not undisturbed and
without time hazard.

Signed-off-by: Jan Friesse <jfriesse@xxxxxxxxxx>
---
   exec/votequorum.c |   48 +++++++++++++++++++++++++++++++++++++++++++-----
   1 files changed, 43 insertions(+), 5 deletions(-)

diff --git a/exec/votequorum.c b/exec/votequorum.c
index dd5bea7..7781077 100644
--- a/exec/votequorum.c
+++ b/exec/votequorum.c
@@ -234,6 +234,8 @@ static corosync_timer_handle_t qdevice_timer;
   static int qdevice_timer_set = 0;
   static corosync_timer_handle_t last_man_standing_timer;
   static int last_man_standing_timer_set = 0;
+static int sync_nodeinfo_sent = 0;
+static int sync_wait_for_poll_or_timeout = 0;

   /*
    * Service Interfaces required by service_message_handler struct
@@ -310,6 +312,8 @@ static int quorum_lib_init_fn (void *conn);

   static int quorum_lib_exit_fn (void *conn);

+static void qdevice_timer_fn(void *arg);
+
   static void message_handler_req_lib_votequorum_getinfo (void *conn,
                                                          const void *message);

@@ -2182,6 +2186,8 @@ static void votequorum_sync_init (
          ENTER();

          sync_in_progress = 1;
+       sync_nodeinfo_sent = 0;
+       sync_wait_for_poll_or_timeout = 0;

          if (member_list_entries > 1) {
                  us->flags &= ~NODE_FLAGS_FIRST;
@@ -2231,17 +2237,46 @@ static void votequorum_sync_init (
          quorum_members_entries = member_list_entries;
          memcpy(&quorum_ringid, ring_id, sizeof(*ring_id));

+       if (us->flags & NODE_FLAGS_QDEVICE_REGISTERED && us->flags & NODE_FLAGS_QDEVICE_ALIVE) {
+               /*
+                * Reset poll timer. Sync waiting is interrupted on valid qdevice poll or after timeout
+                */
+               if (qdevice_timer_set) {
+                       corosync_api->timer_delete(qdevice_timer);
+               }
+               corosync_api->timer_add_duration((unsigned long long)qdevice_timeout*1000000, qdevice,
+                                                qdevice_timer_fn, &qdevice_timer);
+               qdevice_timer_set = 1;
+               sync_wait_for_poll_or_timeout = 1;
+
+               log_printf(LOGSYS_LEVEL_INFO, "waiting for quorum device %s poll (but maximum for %u ms)",
+                       qdevice_name, qdevice_timeout);
+       }
+
          LEAVE();
   }

   static int votequorum_sync_process (void)
   {
-       votequorum_exec_send_nodeinfo(us->node_id);
-       votequorum_exec_send_nodeinfo(VOTEQUORUM_QDEVICE_NODEID);
-       if (strlen(qdevice_name)) {
-               votequorum_exec_send_qdevice_reg(VOTEQUORUM_QDEVICE_OPERATION_REGISTER,
-                                                qdevice_name);
+
+       if (!sync_nodeinfo_sent) {
+               votequorum_exec_send_nodeinfo(us->node_id);
+               votequorum_exec_send_nodeinfo(VOTEQUORUM_QDEVICE_NODEID);
+               if (strlen(qdevice_name)) {
+                       votequorum_exec_send_qdevice_reg(VOTEQUORUM_QDEVICE_OPERATION_REGISTER,
+                                                        qdevice_name);
+               }
+               sync_nodeinfo_sent = 1;
          }
+
+       if (us->flags & NODE_FLAGS_QDEVICE_REGISTERED && sync_wait_for_poll_or_timeout) {
+               /*
+                * Waiting for qdevice to poll with new ringid or timeout
+                */
+
+               return (-1);
+       }
+
          return 0;
   }

@@ -2336,6 +2371,7 @@ static void qdevice_timer_fn(void *arg)
          votequorum_exec_send_nodeinfo(us->node_id);

          qdevice_timer_set = 0;
+       sync_wait_for_poll_or_timeout = 0;

          LEAVE();
   }
@@ -2675,6 +2711,7 @@ static void message_handler_req_lib_votequorum_qdevice_unregister (void *conn,
                  if (qdevice_timer_set) {
                          corosync_api->timer_delete(qdevice_timer);
                          qdevice_timer_set = 0;
+                       sync_wait_for_poll_or_timeout = 0;
                  }
                  us->flags &= ~NODE_FLAGS_QDEVICE_REGISTERED;
                  us->flags &= ~NODE_FLAGS_QDEVICE_ALIVE;
@@ -2777,6 +2814,7 @@ static void message_handler_req_lib_votequorum_qdevice_poll (void *conn,
                  corosync_api->timer_add_duration((unsigned long long)qdevice_timeout*1000000, qdevice,
                                                   qdevice_timer_fn, &qdevice_timer);
                  qdevice_timer_set = 1;
+               sync_wait_for_poll_or_timeout = 0;
          } else {
                  error = CS_ERR_NOT_EXIST;
          }
--
1.7.1

_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss










